
e%iibT«*t-rs«M*o*w*±EAia*^jRf#i- 

T3k«fc» S^Hi^&0 : ^i6IB#XTV^Sg^»]<D 

sat^at* 
ttfc-rs^Bsa. 

±EWK¥ a#±E*tfc'k f -5 «W**r±EH»l©«Jf*-e 

1ST- * fc±gBK»¥ a iiip^^-r 5 * 5 swrr % c 
lis*® 3] ±E»j»#aH\ 

±EWK¥S^±E«» k-T * ft#:&±E«»l©i«*T 

r*c k£®a fc-r sis*® 1 \am<D¥w%.w a 

frSfrfcfJW-TSC k£t$ak"f SIS*® 1 fcE«©¥ 
C»**5] ±I2©]P¥a«, 40 

c»*«6]-Aiffli:«»u 3K*«Bfcai,T»*fc'r 

©&±E1#a©T-*{cS^T, y$R*tok-r«*W*. 

Lfc±KW*kTsi«*©«tfik* mmtsmtrz so 



2003-255989

#"P**3W5*»*W»rT*SB 3 ©Xf7 7k, 
±E*t* k f '*«W**±BRSl3MW* k WW L fc k * 

K^ki-5^(cov^T©±iBra«^itif^% ; en ; e 

±Ettift k T 3«*£±EgJ*n©»*T*& 5 k Hff L fc 
k£tc, S^^k-r5 ! i%{*%IEt<^T-tfc±Ifi^ 

atov^r, iiin^B-rsck^WFfRk-rsw^aec 

±E*f* k f 5 *j{**±EH»l©1«*'efc § k ¥>J0r Lfc 
«6fcE«©m£fe, 

JJEWIW H5tf# L fc±E*t* k 

"3<&SaBS*©£»i*fcJ:!>, ±E***k-rS«M*tf 

Zffi&m 6 lcE«©^#&„ 

±EAIS k «BT Sfcttt, <8aWc«i;TSR«e*5 1 
*#tft c k£#ak-f 518**6 £EIR©¥S2Fffio 

*HS*iI LXMmt ? SW*©«i&*±EAIHfr 6 UStff 

-r^wis^ak, 

*n-rn±ffimk-rsft#©a4sms©«Fa%tttti 

±83«rt*©»«?rs±BBi$a©7 f -*Kat3^T, 
W^k * ««M4c«»tr «*ft©si^a k , 

SIBit^Sk, 

±K*H8¥a#af§ L fc±HE*t* k I" M. £ 

& tf ±EE«* atfEttff 5 ±EM»f 1 

±BWK¥BW±EJ«l k f S»f**±ffi«Mft»i* k 
WWUrcktfc, S^^k1-5^ft©WlST?)±IB ! lt 

a©^- * **±kb»¥ ate ^n^ftiBtis k « 

tc, S^mk-rs^f*fi:ov>T©±fBlia#^mffi«: 

±gBE*^aicffitt*-a-*»jfli¥a k ^ix s c k 
ca*ai 2] ±Kww^«tt>

[ifcRai 4] ±EWW¥Stt> 

*fg*»*w»w*c tftWfkk-rsM** i i fees© 

d#v h8B 0 

ns#ai 5] ±K*i»^att, 

j8»fcJSUT±IBAnfc©±E^*3i*#tf*-J:Sfc 20 

a«#a**»rrs c tttnmt? 1 1 KEa 

[000 1] 

[fPB©Bf2>&fl5#I?] *f8WH:¥B8«&tf¥g# 
SMtfKD#y hSBtcHU mtfxy^r— r'l'^ 

[0 0 0 2] 

5£x;/£— f'f^Vhn*')' h©tf>fcfi, CCD 
(Charge Coupled Device) #;<7 j e , V-('£D:fc>^|©&

[000 3] 

^-f^>^yhn#7 Mcfci^T, »tf* (A* 
fc^o WT, mto ) ©£Mfc*©trf*i:*foSttttT 

teWiS MIS «k ^ fcf § c t ifivt 3 t © 5> n 
©£f £<> 

[0 0 0 4] SfcC©J:5«:*«»*S«ll6*x>^— r 
>rypOhnj!?«y htfiftt^-StlBUT, AffltfaSfr 50 
[0005] fcc^tf, ^Sg^^-r^
siHSHoiw^^oBowfcanTv^o^xi/ir- 

r -T > * > h a sj? >y He $ ■£ & C t tftt L V 
<&•§<> 

[0 0 0 6] £<DtcttM&XHi., Z.—yWftTrWK^P 

t^MUTft^, 1W*©B«Rt;*©*W©S»*fT3 
— Iftxy^-x-ry^y hn^y k<0@M4-f^ 

[0 0 0 7] *aWttJiU:©ja*«*bTftSnfct > © 

8H&tf3*S£8;tttffc:p#'y hSB^UfgLi? tf 
5fe©T*fc£<, 
[0008] 

*fc»dtiii**ru aawss^auTwafc-rsiw* 
©*a*Ara*»e>a»"fs*Ng^ai:* znznttmt 

waa&tf^i&Ett l t v > s K*n©«*© wts-r 5 aa 

s#a«¥a©BMBea%«a(*»fcHa(*t*«a*E 

©m aa»*fc-rsiw*K:W"r**B»*a©KMi 
aa, RtflEtfiitfKit 5 aa# ttaata-^^ 

sww^afc, wa^a^3*»i:"r««n**aa*aft 

t -r 5*#Ko^T©Haf**ta«*Ett#afc 
Efts * s w»¥ a t zmi zkoicLfco 
[0009] c©aac©*aaatt, ip^vyn© 

y f-fc >-9-©Jf JE*ffa©a— If *^ &©gpW 

atTdcfc^fc, ffla©Afc©3ffl§*fflUTaa*Aft^ 
i«*a©*a*saK*a-r*c 
[0010] sfc*awcisv^Ttt, mac*^ 
T,-ABt«Ku aa?*e*fflUTW«fr*iw*© 
£itM£AF«gfre.$rt#-f 5£«ic, wafc-rsiwtoaa 
©aftsmjeowas^njpnaHit, saatuaaK 




t, vrd$mt? zymosan ts m&mt-r* 

[0 0 12] S&fc+fBBfcfc^Ttt, hSsHtc 

tst^T. Aiafc^rs u skwis^ 

■©ttftowis-rswao-r-^tscJv^r, sea***: 

& frgfr * w w -r s mmk t . mm^mwm t ? 
zmwzttMizysfctnmLrctzi^ )$wmmtt% 30 

[0 0 13] C©IS*, C<Du$y h&Bti, 

[0 0 14] 40 

[0 0 15] (1) *HjSS©^{C«t5P4?>y b<D®f& 

m 1 Rtf la 2 tcfc^r, 1 LT*ms©*»k: 

<kS2JE#frl!<DP#'y h*a*U JPftSPP-X-y b 2<D 

P.x<y h 2<D±»fefefc*n€ f nRi;«llS©»8l53.-y 
b 4 A> 4 Btf^n^ftgElSSft, h 

5 BjWi^ftffi£fl«fclR9tttt&n*cfcfcJ:D#l 50 
[0 0 16] P^gpa--y h ZKfcl^Ttt, tt&±35%

1 ltfMMSttfttl 2J&^UT»BfS«ii:tuJ:0«lJa 

IrTSCfctcfcoT, ttSS±gB£03£;^ti££-f3P 
3RtfkTyf-*l 4©[s]»)t^n ; en3Siz:^leJ 

[0 0 17] $fc®g|5a-<y h3(is 7^-Al 0©± 

©&7^iX-*A3 , A4 ^nenigirr ?>ci: 

ICioT, @3KSVfii35-rSfc?>y?-$!il 1 7Rtf3--«l 

1 8©@0K:^n^ f naifi:iHHg?"&S£:fc*<-etsj: 

[0 0 1 8] $e>(C^)¥Ea5a->y h4 A, 4 ^ft. 

x-^a 5 , a 6 ^^n^niBij-rsct^.k-p'eHs 

[0 0 19] COW, MeSPax-y h 4 A, 4 Bti, 

icMmwmmz ^^^xmrn^mm^r^^^ 

[0 0 2 0] ^LT#]&eg|3P.x>y h 4 A, 4 Bt?ti, 7 
^jx-^a, =&|g»1-5Ci:fcj;oTBulig|5^03 
(C^-T3-tt2 4©@0fCl3fi*^ 7^faX-?A 
s ^|g»)rSiii:{cj;^TBuBSia5^03(i:^-re<v^|ft 

2 5©lH]'3{i:-?-n ; fniHie^-i*5ci:^T't§J;9{i:^: 

[002 1] i:nic«Lt»ax7h5A, 5Btc 
»*S-r«HIIWiaH»2 6©&7*^iX-^*A9 ~A 

u zn^nmmtzctic&ix, m3ic*?nwc 

it^fS3-tt2 7, P-;Hft2 8&tf tfyf-UfZ^© 

[0 0 2 2] C©«^#P»ari-y H5A, 5Bti, ^g- 
^3 l^LTTM95?r^-r^7U-A3 2*<JHg« 

3 3 4 jtraagSft* C i: C J: 0 «KS*ftT 
[0 0 2 3] Z.tllc£*)&m%te--V h 5 A, 5BK43 
>yf-tt3 6Rt>*P-;Hft3 7 (DmVlc^ftZM&iLlc®

CO 0 2 4] -7j, RftgPa- >y b 20{**T»*»« 
■TSIg^-Xl lOWnSflltcfi, B4KStJ:5lc, 3 

t, wm®&Rzfmm®®&£<D®m®&<4 it, io 

r'J4 5 (05) fc&£tf#-y*XfciRSW£ftTfcSiW 
Hazy h 4 2j&*EKStiTVSo 
CO 0 2 5] ZLTCCDffl®3--v b 4 2 tt, &Hf$a 
r-yh (HKttM-y H2, 3fgpn.x-y b3, &lRf$:x 
:x>yh4A, 4BRa*#Wlgi5a--y h5A, 5 B) rtt 
^tl ; fnEK?n^§-9-7'fiJSlg|5 4 3 A~43Dtll 
^tlTfcOv Cne>1frSlJffllgP4 3 A~4 3DfC*fLT 

j24sa«i«E*«ie i/fc d . c n&-tf:raiiiSP 4 3 a 
nt^s. 20 

CO 0 2 6] *fc&"9":7*JW*4 3 A~4 3DHu 

a., tmmnhxisv. SK«uaa-y bfto&r? 

CO 0 2 7] $P>tc^g|5a--y h 3(cti, 0 5tC7jrf<fc 
91:, COD#7 h 1 <0 Tgj tLTBffitSCCD 

(Charge Coupled Device) A^7 5 0RO* T^J tL
T®m-ZW?xi*>S MkXSZvttyVS 2%E 30 
frP)&5ttg|5-try9-g|5 5 3fc, rpj kLT«88£t5X 
£-#5 4kft&^tt^nm£<4HK@B^*n, frJSP 
a-<y r- 4 2 F*9tcti. /^f'J-ty*5 5SD*ilPj$]g-tr 

So 

C00 2 8] ^LmSP-b^gPS 3<0CCD*^75 

ott, HH©«H*JMfcu ften^ami^s 1 a£ 

U *K LTt#P>tlfcW§*f s 1 B^/^y*JPgi5 4 
CO 0 2 9] Sft*yf-l!^5 2l4, 0lRtf02tc 
T*3D, a— 1f*»60 rare 5 J * rnp<j fcv^ofc* 

So 

CO 0 3 0] «&Krt»-ty9-»5 7 0/Vyf-y-by9- 
5 5ii. /WxD 4 5©x^;l/4^a^m^lWT*«iai 50 




y*jw* 4 o fcasuvr , imaie-b 5 6 « % 3 
sua* 4 oicmmt^o 

CO 03 1] MytiffltmA Ott, nSP-tr VitgP 5 3 
<DCCD*y^5 0, vY^n*>5 lRO'^-yf-by 

•9-5 zsfr&^tt^nmt&sttawifl^ts 1 a, ^ 
m^s i B&tfE^*w«#s i cn (j-xt, cn?>^ 

5 7(OAyfU-b y +r 5 5 Rt/JjQilg-tr >"9-|?fr e> * n 

?nitm-$nz^y7-vim®mm i %s 2 A&tffin3ig 

8HJffiMIS2B« (J^T, Ctlt>*£ttbXl*l®*yV- 
ffS2 tcS-^^Ts o*-y h 1 ©JSBRtfrt 

CO 0 3 2] *LT*^>*Wtf4 Ofi, :<0«S 

wfigM^u 4 o a tiswsnT^ssjw^pf 
umuA 3 a~4 3 Dizm&tZo c<Dmm. com® 

=tv> h*{c8^ x %<Dy7fam&4 3 A~4 3 D^l 
^tS7yfaX-^A, -A.4 tfH 
ftStt, Jb>< LTUgPa- -y h S^Tfc&fcSSil*^ 
fcO, §Eg|5a--y h 4 A, 4 B£±fCfetfTctK ^T"T 

co 03 3] &rcc<Dm*'rym®&4 o«, jesstjB 

it>S»Wf*S 3(cScK^£y1.a$fcm;>j£-»i-fc 

^a±o rgj tit«ii-5ia5ax7h3^ 

CO 0 3 4] CKOJc^tLTtKDO^y h- UCfel/^T 

frif<D%m&zfcm-3^xmmc'imt%cttfx% 

c o o 3 5 ] ( 2 ) %tt¥%®micmt ynm» 
4 oosaa 

CO 0 3 6] C<Dnt>7 h ltcti, AkOWIS^jibT 
C D7JP<^ 5 OOW^fcS-^^T^mtfc^OA© 

T> *W*W»LT^ft»/VWffiaA©S**S*U * 
©«f«ftAO*(i(l^©»W«fa&t^<D^«WW« 
wo' lt^ < ftwwwijwsws nt ^ s.
fcfcBtTfcfe^Ttt* *©A©J»©£»«1$«fttfS© 
»»IWWfc»«tt»T*M*B»L»*fcA* raja 
<DAJ fc*tf, IEt§USx.T^&^A£ rfr£i&AJ 

[0 0 3 7] *UTCO*W*WWStt, ^MfcWgB 

4 ofc^^a^afcfcon^nTv^s. 
[oo3 8] cct% frftzzm^mmicmtztj 

y*WI»4 0Offla!rt$*«ffiWfc4MS"r*i:, 06fc io 

o^> A©*©^«w«**arsfc£££K«tau 
fcSswftats^^T^cA^awjjtTKa-rse* 

mm LrcBmttmmzm-i^T* ©a*wj ltbk 

-rS^H!l5gi56 2fc, AtC»SW§^iS%A<0 
*MW©fc»©£IM!lP J ^ BE£Q©A©£mK ^crjg 

tfB©^«ww«©E«S3*^ s »sw 

gB6 3 fc» i*K*JW»6 3©»JWOt> fcfc*SJ*Sffl© 
*W»S 3£±$LTXfc!-#5 4 (0 5 ) (c£ 20 

[0 0 3 9] CO^, S^K»aP6 OtfeV^Tti, -7 
^^□*>5 1 (0 5) *»6©«WS 1 BfcStJ* 

**>©■?*!>» WBLfccneniKSi^jf-^D i 

[0 0 4 0] £fcIS#lgB$gl$6 lfi, T>T^n*>5 1 
*»6#*5tl*ai»fll^S 1 Bfc£$ftSA©J*©^i 
WftSfc^ "Segregation of Speakers for Recognition and Speaker Identification (CH2977-7/91/0000-0873 $1.00 1991 IEEE)" fc!a«£ftfc£ffi^fcfij

[004 1] %LTm%mm$i>6 mw#K.t&* c 

^T©K«l<OAO^»W»ao-r-^i:Ji^Jt«L, * 

ofc**Hitfc«»w«fa^v^n*»H»ioAoe»tt 

jWtf&nfei!gSHHmttftfc:Bfr©H9J? (WT, c 40 

t>-aua^ofc*&Ktt, KW^ii^Kift-r* s i d 
(=-o *wis*i®a5 6 3tciift)-r§j;d(c^^nT 

[0 0 4 2] g£B£#KKtt6 1 tt % »BM«PV6 3V 
«H«4A-e**i:W»fUfci:*fcaBE*HB»W»6 3 to 
64*. & tiSir«^B©IIIji6*^atf *S»7#^KS 

■3v>r, ^-oiB^oAopfoswwwa^ttab, 3& 

4$tfiLfc^gWW®©x-*£#rfc&BW©S I Dk« 50 
IfiWJTBIW* £©S I D%MfiftiJW96 3

[0043] %«e#snas6 1 a, >wbih9P86 3 a* 

6©i6^B*inE*B©BB*^fttra7**fcJ&i; 
T> *OA©^oa»flM*«Or-**ifttaW'k:i|jait 
Silill^a-^ *©A©**©S.«Wfta©"r-*fc*© 

[0 0 4 4] ttBW86 2K*^TI4, CCDA^75 

0 (0 5) j^6H6tl«IMfS 1 A«ff«Sffi 

u sbhwis i Afc»"^<B»rtfc**n*A© 
K©»»fl9wa*m3£©«#jai!fc <fc o tttu-rs«M&& 

[0 0 4 5] f-bT0BN06 2«, ilStfmca, c© 

«Wtfc»l»»Ml©7'-** j e©k*Blti/rv>*^ 

T©&ft©A©fM©milW1fi!i©x-* fc*3Wt«U 
*©fc**ttbfc»»«W«W'^"fti3!>»BE»J©A©«© 

ar, *nib3fe«!iwwa6^^-rn©ia35i©A©«©«ii 

5FID (=-i) %J*KiWffli»fcji»j-r5<k9»cft* 

[0046] ztcmmmm 2 it, nemn»s3im 

^4AT-$5 2:¥UK Lfc fc * C ysSftBttfPtf 6 3*6 
*©IBCCD;b*5 5 0*\&©B«fl»S 1 Afcg^K 

H»rtfcd*n*A©«©»»«w«=6*iau, 
WLfc«!i« | ita©7 , -^*s»ffcftHw©F i d tarn 
ttifTtmt&tmc ccDF i D^s«n9as6 3k 

[0 0 4 7] ft«aiKN»6 2(4, *tf£i|fJilgfl6 3*6 

©ataw*wiE**©iite^;fttfi*7#*fcjsi; 

T, A©tt©«»«Wa©7*-**iiiPWKiRJIrt"5i6 
fa^H^, A©«©«aR«W»©T-***©A*:iEL 
<K»T?*SJ:5iriE > rsSriE^S%t>ff^5A3fc 

[0 0 4 8] 3P6jfttf6 4t4, *hb*j»S56 3fr6# 
H**U *»< tT»6nfc*f^OTS3*Xlf-*5 
4 (05) icm&tZ&SKK-ZtlX^So CtHC^O 
coe^rfl^S 3fcS-^<^*xtr-A5 4fr6Htfj 

[0 0 4 9] W^©JffliaP6 3fc*3V>Tl±, 07{C^f«t 
^lc x ffi5?]©A©SBui:, i5#^a56 uWEULT^ 
S^©A©^©^»W^a©7 ? -^fcW*5#^6nfc S 

1 Dk, »IS^6 2^IBHLTV>S j e©A©«©^ffi 
WWIS©f J -^(cWf5^^6nfcF I Dt^waw^T 

isti-rs^^ue 5 (06) %wlt^3 0 
[0 0 5 0] *LT«e*J»S5 6 3*4, BT£©*fS>

C©fc*©*©A©JS»*lU£^<e*K»S 
6 0&tf£#ffittffi6 1 O^BIWSaifctfK^OAfcW 
r«aBM6 20BMttj|U:. **';6 5fc:«ttSft 
•fc±a©BBD©A©€tk S I DRO'F I DOlKltttt 
©fiHBi:KS^^T*OA*^fr«ftATa&S*^*W 
»W*J:5KftSnT^*. 10 
[0 0 5 1] *LT»Bi|?J»»6 3tt, tOAWWR* 
ATfcSfcWWUfcfctfctt* B«BMtf6 1&0MR 

*.&c&icj:r>, cft&jEMBMP.B l &0MBMV6 

2 K^©»r$&A©^©g§W&mVgPi©mi^m© 

n££#K3R&6 i st>*iiig^g|56 2*^*1^4*. 

6ft5*©*rSaA©JS©S«W^0f f --* J f>tH©JB 

»W8a©r-*fc»JStttt6nfcS I DMF I D 

*\ ^*^fc*D»6ftfe*©A©*MfcMtf*tf 20 

[0 0 5 2] Sfctte*J»»6 3ti\ *©AjWS*K>A 
« WK Lfc i: * K tt, fcJS U TRgBSWK 6 

i Rt>wisei56 2 (C)0hi¥w»iriE¥8onttf»^« 

^•^.5cii:{Cctt> £«Ktttf 6 l RO'SMSP 6 2 K m 

mmm s c 1 1 «t t> , @#eatffi 6 1 ar/is^gp 

-**i|5^*S$T**©Afc©2^ ; £fi3lfr-&*«fc5 30 
[0 0 5 3] (3).«ffl¥*atteOTS*t£MMV6 

3 cosftwsaa 

*tc, 45j»*S*ffifc:Bt*ifflSWW»6 3 0Ri*W!B: 

[0 0 5 4] #£ftJfPffi6 3tt, nW^tU 5 8 (@ 
5) K«tt«tifc««P^ , ay5/»lcjS-3^T, H8&CF 
H9fc***tmMWl¥*R T 1 tc$-pT$T#l&A© 

[0 0 5 5] ^fe^^«»»6 314, CCD*^5 40 
5 0*»6©WflMB^S 1 AK»tJ€r«B»»6 2#A© 
tt*m*N£c£EJ:&£ttttBNtf6 2^6 F 1 D# 
ttbtiZtZm^'Smm^MRT l ^f7/SP0 
Kfe^THJJfiU t<^f-yys P 1 fcfel>Ts 

i DStfcntitai-r* f i Dfc£w«{*ttfcni« cw 

T% cn«Wif*«*flHil;"»«) tcS-^^T^OF i d 
fr&ftJ&*54M»fctteRT***^?fr f 1 D 

tfmwpfmzmmz r-u -eft^s**) «w»r 

5o 50 
[0 0 5 6] CCTCOXf-yySP llc^tftg

J&tttt&nfcF 1 Dtf^OAOfcW-fcBBMsfttT**'; 
6 5Kft)|ft*nTl^H»©ATfeSCfc%3S«fS. 
fcfctC©%&fC*5^T^ ISI^IigP6 2tfSBB©A£ 

[0057] *cT?*fe*j»»6 3 8, xt7/s p i 

A-em£©£^Jr-# D 2 %^^g)56 4 {CjMttif 

tfijTtfcfH i ofc^fckSfc:, roo^A 

[00 5 8] ^V^J*e»J»» 6 3 it. Xf >y 7 S P 3 

■To j * rw*, av^fo j fcv^ofc*ss©B^K 

*LT*fISf|iiJ8Pg|56 3«U WT^tSl£8l56 3fr 
5 frfr S if JSBW18I&##*. 6 ft. * fc£#ffiKft 6 1 

*>6 j t©t*©e#saii8St?*ss i D*t#*&tts 

7f'^S P4fCjIA/C\ g|&ffi»35 6 3^6©^ 
^BmS*K«-3*, ^©AOJSSjWtJfcWfcfc©?* 

[0 0 5 9] lc-ClOXt'^S P 4fc*5^T1t£$g 
Xf-y'/SPi te:fevvr«KMas6 2 
*^#AP>nfcF I Dfc»3*tt5S£tt;fc£iiatf£©A 
©*Mi:-aixT*5t), ftoT*©A»i*tK*J»»6 3 
tf&iis L fc£ HU%*tf 5* AT?* 5 S l^»r^T t § « 

[0 0 6 0] fr<\sT£<D£%ttm%W&6 3lt, Z<D 

Mi^mttm®m&6 3#«5RLfc*ifl**!"rs*A7? 

. *fig»JSPtt6 3«, A«9(cSS#eii»6 1 ^6^*6 n 
fcS I D2A A^S^buA^^^U 6 5t««lSnfcH 

4^., Cttte**LT-&LT^fcV^fcfifnE¥S© 
[00 6 1] * LTWsSifiiJffllgC 6 3 It, C ©mx-f -y 7 

s p 6 tcitA.-eMK.tfia i owj^ic r<%stiv^^m 

T'-Tteo J KZt^itz, *OAfc©Jt»%fi5l**-&S 
fc«)©*l»%-*-&Sfci6©A^J'r-* D 2 ^^^fiSc 
ffi 6 4 JEWMffl L, c ©smio¥SXttWiE^Sfc:+ 

X-r>y^S P 7 tJtAl»K 

IE*B©»7*^*#*fca, Xf-y^S P 2 OtciiA, 
T'^©A(cW-r^«mi^SMS%^7-rSo 




[0 0 6 2] P 1 tc*«/^S«tt*«r 

t#s c a, gRBMft 6 2 tc j: bums tut Awmm 

cDAT-&5*\ XttMBKV6 2tfSEftie>A&#rSt£>A 

VTgjSISIlfcfiS E fc(4, flHHKKBiias 6 2 5>-5- 
K.&nfcF I Dft>6tt5R«nfc*W*^©AO*1Wi:- 
®.LT^%^££.*WM?%o ZLX, Ctl^ttKD 

[0 0 6 3] ir£-?tmftffl&6 3lt. Xf77'SPl 10 

^-B-JsJtgp 6 4 KX^J-r-* D 2 £4* S ci i: (c J: 0 , 

M*tf m 1 1 fc^-r«k ? t, r&n, «m*«s.t < *s 

J i^ofc, ^A^M^MttH-fftfetDft&g 

[0 0 6 4] *LT#B«»a'6 3(4, COtXf-^ 
SP9(CitA/e, frfr5«ra(c*ff 5^©A© TOOT' 

miZj&mic&vzmgim&G i<Dm%wmf&% 20 

[0 0 6 5] * bT#!S*iJ«6 3(4, WT^K* 
W6 0fr6eMWWS«ft*4*.6fU IS^tSiigP 6 lfr 

6s i DA^^en^i:, xf-y^s p 1 oicitA-e, 

cn6^KS«ie«&tf S I DMtffcS«JfcH^|g|J6 

[0066] c<ix£<DnM<oBm<om£i. frfrzmm 

(4, W#KHS»6O0**ffiWc«k»)f#&nfc£fflfc* 30 
fS#I£«6 lfr&OS I D4u 0IIgf3sgl5 6 2fr5>CDF 
I BtXZzZ 3 o©BSWS*0^»J*K:«): t)fTt>n«. 
[0 0 6 7] mtf, £i£KK&6 1 *6©S I DRV 

r-.u ^xf7^s p»ci3v^Tg^BaRaP6o 
^eos^Kiwsjiitcis^ffwenfe^oA©*^^ 

^E'J 6 5fC*5^TifcDS I D^F I DfctHStftt6ft 
TV^V^ttt, *OA* , «r*ftA'ea5-*fc ! PJWr , r 
So S^©£0«Xtt2©Pfcfcmoav&^Atf£<*r 
u^ii*t)oTi^i:^5tti»T?feso"e, *o«fc3 40 

[0068] £itttmm&6 3\t, m%mm®6 1 ^ 

I DfttftOBttflte 2A>60F I D*M*U 6 5 

*«WF**lt»'r* r— i j X2bK>, fr-oxrv? 
S P 9fcfc^TSjSBi*»6 0*»6OS^BSWeftKa 

^*fc6nfc*©A0*iw**y 6 5t«Mft«nT* 

V<DZtlfrt%%®mtZ><Dl&&£*)%^CtX&*). £ 50 
rc^pmmztirzzntfmmztix^K^c t*%xn
tf, frt*K>&\tmm&*%>^xmm<»Atw&xzz>fr 

hX>tb% 0 

[0 0 6 9] cintcWLTWIS$iJ®a5 6 3 t4, 
»6 lfr6£>S I DRtf&l8flR&6 2fr60>FI DAM 
6 5fc^TPIi;*l9i:Maimt6nT*0, fro 

JSJSfcS^tfc&ftfcf-OAOSfliftW'S I DSD* F 

[0 0 7 0] £fcttf£$OTg|56 3(4, sS#fg§$gl56 1 ft* 
SOS I DRl>*IM12f£gl56 2*^6©F I DAM^U 6 5 

p 9 K43ir^T^aaw»6 o*6©^aawa*K 
a^tfienfc^oAostatfa^s i dx«f i d 

mavAx&ztvmtZo c b#rw*6 
[0071] wis$ira6 3 (4, m%wm>§ \ 

*»6©S 1 D&CFMKMV6 2fr&4)F I DAM^U 6 
<y:/S P 9ic43V^Tfifmam'6 0fr&©SJ»Ki8B* 

fcSt3##e,nfc^-©Ao^M*^ : eu 6 stfcvvt** 

*»SS I DRD'F I DCD^ttllC&mMtttfZnX^fj; 
^mX&^m^cit, *<DA^K»IOA-p*«^Xtt 

e»ffi6 o, mmmm&s \ msmwrnt zo^tn 
frxit£&<DWffitfmm-ox^zc tt>%z.t>tizi)\ 
ccDmvgxitztizwfctzc ttfx%r^\ fctc 

[0 0 7 2] ^UTWsS$iJffllgP6 3(4, CcDJ^MSi 
Saatcit), Xx>y7"SP 1 Ofcfct^T, ^*f»*Atf*r 
JS<0AT-&5t!pi»rLfc^fc(4, Xf'y/SPl He 

356 2(C^^., d©tXf7/S P 1 2 KJiAT?0<i*tf 
Bl l©«fc9tc rs,{4D#-y HT-To <fc5L<*3P^L 

Sfo j xt4 ^oo^^, 4-0(4v^^mT-rfe o j s 
Z(D*<DAt<Dm£zmi\frit%m&*tzrztb<Dtt 
m?-* d 2 ^^^gP6 4 tann-rso 

[0 0 7 3] $fc*tlS$Wa5 6 3(4, C©^Xf--y^S 
P 1 3£jgA,-?gg#Ktt96 1 K*tt5e»fl9Wa©7* 

-^©iR*st;isBiia56 2 ictstf%M<DBmmmo 

£&S££t#3i:Xx'y:/S P 1 2(cIot, ii^Xf 
7 7*S P 1 3(C*3l / ^T#^^t#SST'XT'y7 P S P 
12-SP13-SP1 2(0^-f^*)'Mt o 
[0 0 7 4] ^LT^iS*Jfflia56 3(4, ^T^SISIS 
356 1 \ZftlfZg®mZWL<OT-Z<DtiL&RXfMim& 
6 2 Km %mmmm®m<o7-z<»mkt>mc-Yft
micmt5ctic&*)X7--y?s p i 3 tcfc^rBjeis 
»*f#3fc, xf7^sp 1 4tc»A/e, ctizm^m 

^356 1 S«$6 2 fc*i«3 s S©»7fc<fr*^*. 
So CClS*, fi#ffiK»6 llCfe^T, ^©^ffl$& 
i<Df-?^ifefi: S I DfcflfSttttkttTfBttStU 

&F I DfcWJStttt&ftTiatStiS. 
C0 0 7 5]-SrcWK»W»6 3«, COiXf-^S 

p i 5kia,t\ &raiffi6 1 Rummmme 2^e> io 
^ivrna^ss i D&tfF i D*^*6.n*o*»"6 

fcWJiWtT^^U 6 5fc§ShrSo *LT«K»Jflil» 
6 3 CCD^Xx-y^S P 2 otcjSA/e^-oAtctt-f 

[0 0 7 6] cnKJtUTWK*j»ai6 3tt, Xt>^ 

5 P 1 Otcfc^T, frfrSAtfSEfcOA^&S^WBrb 

Xf'yys P 1 6KjtA/r\ fS#BH!g&6 20 
1 RtfliBUffi 6 2 *©R»I©A*IE L < ffi»?3T 

AKftfSTSS I DXtiF 1 Deisms 1 dx«s 1 D 

m®6 1 Xfi8RBi$gB6 2 tc»bTiiiP^SOR»6i(|r^ 
Z5tL, m%ffl6M6 iBLXMUtme 2*<*OgBP© 
A£jEL<BisT*£&frofcli-& (f&fr-£fS#Bf$SI$ 

6 1 *MKN*6 2tf, Hfjtttt*1fHR & L-T^D .6 5 
t«tt*nfc*©H*POAfc»iS^* S I DXtiF I D 30 

Jf£) fcti, ?<Dg#KiRg|:6 lXfc*gBSlR$6 2(C# 

[0 0 7 7] afcWfctt*. ^e»J«paS6 3M\ Xf7^ 
S P 9fcfe^Tf#&ftfcS§#B»»6 l^OSID 

t, mwicmmm^e 2frp>#;te>ftfcF i Dttf*^ 

t-7^S P 9fc:fe^T§J»BHS56 0*»6©@^BBWS 
*K:IStJ*»6nfc*^ J C'©S I D&tfF I DtfBHI 
Wt6n/c^But?fe5CfcK«J;t)X'ry^S P 1 Otcfc 40 
^^©AtfK^AT'&S^iJKLfcttfcti, f£g 

[0 0 7 8] gfcfi«BMaiS6 3(i, ^T7^SP9IC 
fe^Tfc&ftfcB#B«*6 lfr?>©S I D £:> ««](c 
«BR»6 2jb»5#*.6ft;teF I Dfctf^tU 6 5fc*5 

p 9 ic^Tfmawe 0 6©§WB»IS£tcS-3 

t#5nfc£ffitffrfrSS I DXfiF I 
^ae.nfc^ttJT'SSiii:{cJ:OXx>y^ > S P l 0\cis 50 
^T*OAtfBB0OAT»SfcWKLfcfc#fc»i, «^

BMflS6 1 XfiffiBtttf6 2fc»LTia!lO*H©IMWI» 
*6nfc*Mfcnaf*tt&ftTVvtJ:V»F I DXttS 1 D 

£Htfjbrcffi7D©isBfia$6 2X(iK#B^aP6 1 tctr 

[0 0 7 9].*bT*flB»JWS5 6 3tt, COft^fy^ 

S P l 7 (cJtA/P, l 3 K^f <fc a fc> r&fc 

00?A/T'tfe B^ttlLSLfcJ:. 

•Tfeo J > rffiatt*.-^ l>o^V^Lfcott. J & 

£©*©Afc©tt^fcS3iaM*Sfc&©St^£2-eS;fc 

^©X^Jx-* D 2 fcgfS^figgB 6 4 (cMUkSStt) U 

St, Xf7^S P 1 8fc!A/efS#B^g|56 l&tfK 
BMBB6 2fc«LT3Un^BXtin , jE*SO»7i8r^* 

5*.rc'&, xf7^s p 2 otcJtA-e^oAfcj^-r** 

[0080]ffi^ *ff£ffciJSPg|$6 3«\ Xf7ySPl 
0 fcfc^T, frfrS Atf KfcW>AT?&S t fcfr8i©AT* 

P 1 9 KJIA/C, fljjttt* 0 1 4 t^-r £ 5 tc, r&$,^ 

^JUr-* D 2 6 4 t«3WaSWr So 

[00 8 1] * LT fi* ^tsSmUSPSP 6 3 1±. 

7lfrfr*S#tffllt»6 l&t/IBBl&$6 2 £4*1" (f 

BISSI56 lRt>W«6 2(cRfe^) , 3r^«flffl^ 
fflkt&t. P 2 OfCjtA,T*^-©AfcW-rs 

*ltt¥B«iSfc»7T«. 

[0 0 8 2] C©J;9(cbTWiS$fJ©g|56 3tt, b^B 
^6 0, S#BiSSI56 l&c;SiBntf6 2©=&BIS^ 
SfcS^5l/>T. Ai:©W^P J f'^B^gi56 iRXfM 

mm& 6 2 <DW)ttmm*i7 ?ctic&*) y mm* a©* 

[00 8 3] (4) g^B^g|3 6 0St>*IMBI^a56 2© 

[0 0 8 4] (4-1) B^B«a56 0<0ft(tW«J« 
©T'feSo 

[0 0 8 5] C<0^BS«aJ6 OlCfcV^-nis V-f^D 
*>5 1^P)©^<H^S 1 B£AD (Analog Digital) ^SP7 OfcA^-fSo AD^gP7 0tt,

h57to^t*5l^s i Bttyyfvy 

9. tt7fbL, x^^^^M^T-^S^r-^tcA/ 
[0 0 8 6] «f««lttai7 Hi, *CfcA7j£n*S/*

r-^tco^T, »S*:7U-2>oi:te» t?iR(4\ MFCC (Mel Frequency Cepstrum Coefficient) ##r£fr
</\ *O^OlSjMI6n*MFCC*, WflR^h;!/ 

a«b<5*-*) tux, -Tvf-yym 2t*mm 
6izmtit^o &*5, ^m«imgP7 it* 

ax^c^ wwt* »r*©Ji*»^fc©/<7- io 

[0 0 8 7] 7yf->^»7 2B, «P««lHja5 7 1*^6 
©1ffK^ Wl/£ffl^T, «WE7*;I/K1t»7 3, ft* 
E1SS7 4RtfXffiE«»7 5*<fi«K:i&i;T#HL4 

II^HMM (Hidden Markov Model) 

[008 8] f )S:fe^ff€f MEtt& 7 3 (4, g^K 
»1-SS*©§ISK:fctf;6ffl4©3JR J *», ^g&, ^St& 20 

(PJ*.(4, HMM©ffi, DP (Dynamic Programming) V

Tt/^o 4*5, CC-ettjttt#*HMMffifcS"3^T£ 

HMM (Hidden Markov Model) tfffll/^n^o 
[0 0 8 9] ^*8B««7 4 (4, BNjWkOft^ttcTk 

#H»*%Ka«LT^5. 3D 
[0090] CCT, 016 ti. S¥«I3tftgP 7 4 (ClBtt 

[0 0 9 1] 01 6(C^t<fc?(C, ¥i§S*#(<:*5V>T 
(4, #H©MtB b *OSf «£W ttlSfttt 5 tiTfe 

tit^s, @i6©#iftm looiyhu (0 
i6©ifx)*\ i o©^5x^(cis^-r?>o 

[0 0 9 2] 4*3, 01 6K*5l,vT\ MtfJbteP-T^ 

^T'SLTfe^o fcfcb, e«^5»Jfcfe»SrNj(4, m 40 
* f A/J * fc, 0 1 6 T*(4, l o©x y h- >J (ci 

•o©gg|I?»I2j£ b T £ 3 , locxyhU (c (4}g 

[0 0 9 3] BHfcRD* I EtSSP 2 614, SHNBtt 

[0 0 9 4] CCT*, 01 7(4, 35Cj£KttgP 7 5 (C|B« 
SttfciSaiNfcjKLTV^. 4*3, 01 7<D£i£StiiJ 
{4, EBNF (Extended Backus Naur Form) T*!BiE2
[0 0 9 5] 01 7(C*5^T(4, fTS*^«SltcStl*

ttib (01 6fc^brca-v^{<:«k5Mmb) £Sf 0 
££(c [] TH*nrc^4«BSRl^T*3&SCi:^ 
b, NJ(4, *©jffl«fcEBSftfcJtrtlL©#g§ (&5 

[0 0 9 6] ftoT, 01 7(C*5^T, {?IR(4\ miff 
Gkfr&lfrB) ©&i4#lfWf $col = [Kono | sono] iro wa; j(4, g&$coltf, rcWS (fe) (4j3;fc(4

r*©^3 (fe) (4j k^?#f§9JT*&3ci:£g-t 0 

[0 0 9 7] 4*3, 0 1 7(C^bfc£i£*ifW(C*5V*-T 
14, «ft$silfc$garbage*^a«tlTl/^V^ $& 
$sil(4, ft&QSVer'l' (MtfW £^b, S 
&$garbage(4, g#($k:(4, ^31£-5 b©RrC-©g|tl 
4aB*«F»>J Lfctf-^^T^l/SrStf o 

[0 0 9 8] l?r>*01 5{CM0, V-y 7 2(4, 

mew 7 4 <oJnag»*&#8«- § c t tc * t> , #w 

*7%K««7 3 KEItSnTV^Witr^SgK-r 
§o ^(CV'y^y^gP? 2(4, jgtO^O^ffitT*^* 

*iHE«» 7 5 kk*s nfcx'fflsaijsi^is-r * c k 
4t3-6v-yf->ya$7 2(4, ^mamgP7 ift^-rs 

fe*v>»»lt7 s ;H!)*W**fflL, ^©*f§^r;l/©^ 
[0 0 9 9] <tO*ftW(C(4, 7»yf>y*7 2l4, S 

issnfc^Rt-r >nc»is-r soffit <t oScM br^© 

;KcS^, Jli!^HMM^(c<};oT, v-Y^D,i-,y 

0i57 2(4, ittm»tBSp7 i&mti-tsm&moim'Vr 
5>j©HtHb^^ss^i;bTm^-r5. 

[0 100] «fc0*i*W(C(4, -?>y?V?&7 2(4, S 

viwmmm mtmm *^*b, ^©^«ffl 

[oioi] ^©jc^tcbTa^i^tisv-r^a^y 
5 1 Kj&znitmrtwmms.. x^Jt-^d i t 
[0102] ccT-01 7 <Dmm<DBmT*it, m9fi

■?) "$pat1 = $color1 $garbage $color2;"
tl/c^fc It, m&.$ garbages ttJfcT 3 ^EP«g£* 

m $ garbage^g-T tf-^^r ;KC fctt § ^iR<D»^ 
kLT©^gt^J%*atlf§©^ii^JkLT$|JBT 10 

iMco^EFitgn^j^*e^iHEr^agi5 7 6t« 

[0 10 3] 4*±i£o*«EfiaifflJiBij "$pat1 = $color1 $garbage $color2;" tc Jctlfcf, ^£fc $color1 T*g

t. IEa$color2T?a«n*#«B»*KSS?nT^S 
oTS>5IffiW?fc£o 
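The rule `$pat1 = $color1 $garbage $color2;` quoted above reads as: a phrase matching `$color1`, then an arbitrary stretch absorbed by `$garbage`, then a phrase matching `$color2`. The following is only a minimal sketch of that reading, assuming word-level tokens. The function name `match_pat1` and the phrase lists are hypothetical placeholders, since the actual contents of `$color1` and `$color2` are not legible in this text.

```python
def match_pat1(tokens, color1_phrases, color2_phrases):
    """Match `$pat1 = $color1 $garbage $color2;` against a token list.

    Returns (start, end) token indices of the span absorbed by $garbage,
    or None when the tokens do not fit the rule.
    """
    for head in color1_phrases:
        n = len(head)
        if tokens[:n] != head:
            continue
        for tail in color2_phrases:
            m = len(tail)
            # $garbage must cover at least one token between head and tail.
            if m and len(tokens) - m > n and tokens[-m:] == tail:
                return n, len(tokens) - m
    return None


# Hypothetical placeholder phrases, not taken from the source.
COLOR1 = [["this", "color", "is"], ["that", "color", "is"]]
COLOR2 = [["okay"]]

span = match_pat1(["this", "color", "is", "XYZ", "okay"], COLOR1, COLOR2)
print(span)  # (3, 4): the token "XYZ" is the stretch matched by $garbage
```

A real recognizer would match against acoustic models rather than exact tokens; the sketch only shows how the middle variable isolates the span that the surrounding variables do not account for.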

[0104] *a^^EMasi57 6«, *$aaaa5 7 
1 fr&fluesnswaK* h/i^yy (#a^ h;wr 
m> fc-i&Ettrs. se>£, *a®i§EP^agP7 6 

tu-rs, *LT*a»igEifflj!iffla57 6tt, -?v=f-y? 30 

«7 2fr60««fc?U (*SS») ta x-^7ft I D 
(identification) fctfU *3B»B©WBfi|lJ » * 

^h;!//^y77 7 7{C^-r5o 
[0 10 5] Wa^* h;W^77 7 7«u 0 

1 8 *a»SKH«yi»7 6^6ttis« 

[0 10 6] CCT'Sl 8fc:J3V>Ttt, L 
T 1 SOS'-^VS'-Wl/ftaWrtf 1 D k LTttSftT 40 
V>5 0 fct> 0iJ;UiV3U 1$a^b;WVy7r7 7 
K*V^T, Nffl<£>*a^I§<D I D> SfM^jatfWa-S 

357 2««*a»H©*?»Klffli:a«*?>J%ttW'r*i:, 
*^ffiEH«yi*7 6T«, *©*BaiiKttLTN 

tt, 01 8fcjS»-e^-r«l:9K, ^<D*atSf§©l D, 

[0 1 0 7] fftfBl 5££g!)« ^5X^U>^7 8 
ti, #a^* h;w<y7r 7 7K*rfcfcEtt3ft;fc*S 50 




Um (WT, ait, fr*S»Ki:^5) fco^T^a 
^2 h;WW77 7 7KJ8teE1tSttT^Sfl!l©*3a 

"r*X37*fUW3. 

[0 10 8] *ftfr-&*7**U:'y»7 8fi> fr*a 
X*'J>^97 8tt, »«^h/l//W7 7 7 7*#i 

t 3 c k T*#r*a§si§<^a^ * h wmiewmt z> t 
1 1. k, KttBit*a^o^ii^5>jt tfttfo r-awe 

[0 1 0 9] ft*, WE-rMi, «Ef;MEIi»7 

[oiio] *7**y>^gP7 sn, mfflcL?. & 
mm^mmic-D^x, *r*ss»»«:jtr4X3 7fc 

ttiU ; £©X37fc:<fcoTXn7^-r-EtlSl57 9fc 
EtlSnfc^37'>-h%H«r-rs. 
[oiii] ?p,{c^7X^uy^g|57 8tt, Mfrbfc 

zmmmm (jke«*bsk) ztvxzvyfLrct 

7X^OW^, Sr*a@^frfeft^>^i:LTJiP^ 

3*5x*.%*ftHi-rso *6£i'7^';yjf»7 8 

ML, *©«»Jtl*tlSrJV^T, X37S/-HB1i»7 
9 Kt2tl£ tlT V ^ 5 7. n 7 h % . 

[0 1 12] ^3 7^-hE«»7 9t4, fr^SfRRic 
■Dl^T©BtEtt*a»l8ti:»'rS^37^ KEtS*a 
Sffifco^T<Dff*BS»fcaf3X37«tfS»£tt 
fcXn7^-h^Eit-r§o 

[0 113] C CT*, B 1 9 It. Xn7->- h^LT 

[0 1 14] X37->-Mi, *a^l§ofi Dj, r# 
IDI^iJj, '^7X^ty/^ fW>/5i Dj&tfU 

n7j tfExE^nfcxy h u T*ia?ns. 
[0115] *a^i§« r i dj t r^gi^^jj k LTti, 
1$a^ h;W^77 7 7fcE1i$n/£feOi:l^-cDt 
cD*^^X^Uy^"g|57 8tc£oTaii£ft3 0 
x?+y/^j«, ^coxyh 'Jo*aiilg*><>^kft 

V^»7 8C«feoT#«n» X37S/-KKSESStl 
ffi©*a§ii§^nemctttsx37T"&?K ±&Ltc

[0 116] mtf, <^£> «f«^h;WVy7r 7 7 
fc*V*T» NfflO*gtif§0 1 D, SSU&JU&tfftaK 
> MWRWaWEII«nTlf^*fct*l:, an 75/- He 
^ONfficDJkStSfgOI D, *M^K ^7X^t 
fWt^/M DRtf^aT^aSJftT^S. 10 
[0 117] *Ln$fiK* 7 7tC. if* 

mamztizt, ?7xzvy<!?®7 st-mu t>37 

[0 1 1 8] Tttfe'&XaTS'-Mctt, £r*S®f§© 

37 (0 1 9£*5tf?>X37 s (N+ 1 , 1) , s 
(2, N+l) , -s (N+K N) tffihnZtl&o 
fcX37i/-HC«:> KBIfcfcSSS^ft^ntco^T 20 
<D«r*aSllti:*trs^=»7 (0 19{cfett?.s (N + 
1. 1) , s (2, N+l) , -s (N+K N) ) tm 

$$K = \max_{k} \left\{ \sum s(k', k) \right\}$$
[o i 2 4] -e^nsfflk (6k) £ l D£?2>*y 

'WKmty^tznzc fct^So 

[0 12 5] fcfcU (1) Sfcfcl^T, raax k {} 30 
H\ OrtoM&^fc^SkfciSiilrrs. $fck 3 lis 

So $6>t, ztt, k 3 Zt^xiUzmtZty^t^ 
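Equation (1) selects, among the members of a cluster, the member k for which the summed score s(k', k) is largest; the second index appears as "k 3" in this extraction and is read here as k'. A minimal sketch follows, assuming the scores are held in a dictionary keyed by (k', k) and that a missing entry counts as zero. The name `representative_member` and the sample scores are hypothetical.

```python
def representative_member(members, score):
    """Return the member k that maximizes the sum over k' of s(k', k).

    `score` maps (k_prime, k) pairs to numeric scores; missing pairs
    (for example a member scored against itself) are treated as 0.
    """
    def total(k):
        return sum(score.get((k_prime, k), 0.0) for k_prime in members)

    return max(members, key=total)


# Hypothetical scores for a three-member cluster.
scores = {
    (2, 1): 0.5, (3, 1): 0.4,
    (1, 2): 0.9, (3, 2): 0.8,
    (1, 3): 0.3, (2, 3): 0.2,
}
print(representative_member([1, 2, 3], scores))  # 2
```

On ties, `max` returns the first maximal member in `members`; the source does not say how ties are resolved.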

[0126] &&±)&<D&oicftm*y'^mfc?%m 

^fcti, ^©2o©*g»fgo5^oi,vfn£tt^:/ 
[0127] zrcttmty/wffizjjmt, ±&Ltz& 
<D*y^tr^x^%im.®m<D*)%, mo^mmm^ 

lc?zt>(Dmz ; t<Dtr ; 77>5<DKm*y;%t??>ct$> 
[0 12 8] tt±OJ:dJc«l<«*nsffjBBMII»6 0T* 

v^^Dt>v5 1 fcxasnfcepf*K»rs*F* so 
[0 119] &*>\ 01 9<OHSS<D^(C*3l/^T(i, 1

d# i omwrnm (©ass) mo^r©, i d# j ©* 

(©^SUS^D £Jtr5X37fc, s (i> j) 
[0 12 0] $feXa7->-h (01 9) fCl±. I Dtf 

i msmm to^ro, i d*< i ©*gg& 

ffi (©WIR^J) lc*W*^37s (K j) fcSBS 
ft*. fcfcU C(DX37s (K j) «, v-y^y^" 

[0 12 1] St>*01 5t^0> ^y-r-fyxgl580 
is- h fciS^T, S¥»ffB1SgP 7 4 fcEttSnfcJHiS* 
[0 12 2] CCT\ ^9X^cD«g^yA(i, ^<0«t 

5tj*^«nSo t&t>%. mm* ? ; 7?\$(D*y/^ 

[0 12 3] 

[ttl] 

(1 ) 

[0 12 9] £«Lt ^SSHR«6 OT'ti, A*<»Bfc 

5 l^e>AD^gi57 0*^LTSJ»f-*fc«nT« 

2^f77'S P 3 OtCtJV^TF^^n^o ~" ' 
[0 13 0] fLTi<Xf7yS P 3 1 Cfc^T, # 

[0 13 1] V'yf->^7 6fi» m<7.7yyS3 2 

OV^T, ±3iLi'cj;9fCXa7ttg[^TV\ CCfXf 
7 7"S 3 3fcfct>T, Xa7tHCSS#5n5X37 

[0 13 2] *6fcVyf-y^»7 2tt, ^<7,x-y^ 
S 3 4 tCfe^T, a-+fO^(c*^gS|g^$SnT^ 

[0 13 3] CCT\ L©Xf77*S 3 4 (C&I^T, 3 



t%&%±i&<D&&mmmmi "$pat1 = $color1 $garbage $color2;" 3W8ffl?n"f (C^SBWS*^
[0 13 4] utUCttLTXr-V/S 3 4fCfcV^T, a 



"$pat1 = $color1 $garbage $color2;" *W*ftTS*BWS*jWI&*lfc
V^^2 3tt, a<Xf.7^S3 5lCl3^ 
T> ^aSBfflJSBIJoaEKSgarbagpfcWiS-ra^K 10 

g$ $ garbagetf^-f JS-^i? * r ;UC *3 It 5 €91<DS 

u *«>5icWiwo«^K»t«^jft5iaMWKia 

4B8»7 6k:fltt&LT\ JMftUFTf* Uf77SP 
3 6) „ 

[0 13 5] -7?, *a^l§iSM5aSg|57 6tt, 

m»7. i *^«*e?nsifa^^ wuj&9J*-is321sl 

7 6tt> Wf-yVUl 2*»60*e»BB 
777 7 7-Efltlfrr*. 

[0 1 3 6] JW±OJ:dfcUT, #^h;W^v77 
7 7 {C^fc**g®l§ (fr*g®i§) ©ID, 3ffi£?iJ 

■oisira 2 1 K^-r*e»as«ia¥)SR t 3 
■zftt>nZo 30 
[0137] -rato-sswsiweoicfev^Ttt, ±$ 

©£3fc1^^*h;W^77 7 7tefrfcfc*»»S 

•<*?*»»») ©id, tfiu&j&cmft'** hsisftai 
jWE««n* t c©*g®i§5!ia^iiiR t 3 *uf7 y 
s'P4 0Kfe^TSJ&sn, ^•fmwicxfyfs 4 1 

{C«I/>T> *9X*y>:?gB7 8*\ 
77 7 7*»6«5tcS»KO I DkM0R5«J*tt*lfl-r. 
[0 13 8] ^Wf-yyS 4 2 (CfcVT, ^X£ 
yy7»7 8*\ Xn7^-hlB1ta53 0©XnT^- 

[0 13 9] ^LTC©7.x>y7S 4 2(c)3V'T* "TT- 

37->-h tcgjiBtt*st$ig©xy h y l 
asig{cH-r5ifa!t%x37~>-hgBiigi57 90x37 
[0 14 0] tftfc%i'7Xinj>'^7 8fi,
^h;l//^«y77 7 7^6tt*WLfc*f*SSSI<DI D*5 
itf^SHJRW&XarS'-r* (01 9) (ce®1-5o $ 
£{c^X2y >?"g|5 7 8H\ a--^^^-5X^^y 
/**£lSU «r*S^ii©^5X^^^AtLTXn7 

s^-ncaar*. 'Sfc^7x*yv^7 8«\ #r* 

gltlg© I D%^©8r*lg^©«afy^ I D t L 

[0 14 1] &*5, ^Zrom-et, fr*^l§i:©xn7 
%fW-r*HE1t*»»Jg*^aLfti/^ii), X37© 

[0 14 2] *\fr5Xf->y7S 4 3©$aS&«U Xf7 
7S5 2fciS*, ^>r^->XgP8 0(i, Xf7 7S4 
3T*Mfr^n/cXa7v'~KcS^^T> ff*Ett»7 

5 4) o _ 
[0 14 3] ?1*t>*>, ^£©*Il^ ift4^7X?tf 
4^tlT^5©T% ^Vr* >Xg&3 Hi, X37~> 
-KcfeirS^^x^-f y/^#B8U ^©fr/cfc£/£ 

^oi'^x^twtsnxi/ h'j*»«E«ai57 4 

[0 14 4] -73, Xf77*S 4 2lCfeV^T, f T'fC* 
a66tiTV5^^X^^#a-T5i:¥iJ^$nfc%&, f 

X37->-h (019) tc, Raeit*ssaB©x> h y 

«t) *^fiEfS*&, XT77S4 4 Kit*, ^X 
*yy?f»7 8tt, *r*SSHSf«:o^T\ &BEKtt*S 

7£fr*T3o 

[0 14 5] mtf, <^ I Dtf lftSN 

fiOKEttHcSSBatfffcU #r*gti!g© I D£N+ 

i£f£i:, *5X*y>^»7 8T'ti> 0l9fefev> 
T&MT*7n L fcgi5^©fr*§gSI§fc OV >T © N <@©B!EIB 

tt*sai§ ; en ; fn(cWTsxn7s (n+k d> 

s (N+K 2) s (N, N+l) Nffl©KIBtt 

*eais^n ; pn(co^T©«r*stsi§{c«-r§xn7 

s (U N+l) , s (2, N+l) -\ s (N, N+ 

i) *w?ft5o ftfe*7x*yy^a57 8tc*3^ 

T, cne©XrJ7^ttS-T?>tcfefcoT(i, ST*^iS 

5 h>l"^y7 7 2 8 %&m?% C fcT?B»S*i3. 
[0 1 4 6] ZLX?77,$Vy<7®7 8tt, ItStfc 
xn7^fT*Sa^©I DRlfeWSWfcfcfcfcxaZ 
S/-h (019) (cilSoU Xf'y^S4 5tCjtt?o
[0 14 7] Xf-y^S 4 5T*14, ^5X* 'J y?"gl57 
8ttW7->-h (01 9) *#B|-rSCfctJ:»), #T 
■*S»aBfcO^TOX37s (N+l, i) (1 = 1, 

2, ■■■> n) (*so t*w&*y**m 

14, xn7~>-hcD^^>/U D^:#Bg-f5iii;tcJ; 
££(CX37:y-b©X:37£#!$-f3Ci:T\ &r*§ 

7 8(4, ^CD^t«L^g^y^i:LT£Dg!lI2ti*ga 

[0 14 8] Xf-y^S 4 6K3i#, 

•jy^»2 9(4, *r*SS»*;*7"y:/*S 4 5^«at 

/vcmxzo •ntt>-%t7x*yy9&7 xn7 



til * 7 x * <Dftg * w * -5 x £ -f y f * iitr. 

[0 14 9] ^11^7X^^^7 8(1 XT V 7*20 

$$D(k_1, k_2) = \operatorname{maxval}_{k} \left\{ \operatorname{abs}\left( \log(s(k, k_1)) - \log(s(k, k_2)) \right) \right\}$$
* S 4 7£*5^T, ^Ili^X^m^OcD^X*

JUtt. Xx-y^S 4 ^X*yy?"g|$7 8(4, 

Xfy/S 4 7(D^7X^^ii|Saa(CckoT, «HJ*5 
X * * 2 O <D * 7 X # (C f 5 C t & X # if 9 

S4 9fcil€r«, Xr7yS49m *7X#yy^gi$ 
7 8(i, ^tB^7^x^(D^fiJ{c t tt)WP>n^2oxa^v 

X* (C(D20(D77X^^ fcTR jffii. SlWfi' 
10 7X^i:IB2© : F^'7X^tV>-5) m±<DfS<D^yX^ 

[0150] cct\ m\Rvm2<D??7xzm±m 

[0 15 1] tm^l©^7X^^2 0f^7 
**©fS#©ffit©*S"< ©I D%, k? 

g-f^kic mi tfH2co^yx^(Dftm^y^ 
(*a«ffi) © i d*, ^ti^ftk i Sfc(4k 2T-S•f 
;:^:i:-fSi:, 
[0 15 2] 
[«2] 

(2) 



[0 1 5 3] T*^£ft3ffiD (k 1 , k2)*m\tm 

[0 1 5 4] fc£U (2) Stefe^T* abs 0 (4, 
0 F*3<Dffic9l6ttffi£3fto Site, maxvah {}(4, k£ 

gardens tfciog 

[0 15 5] 1^$, \DWl<D*y/*&*>'i#l£$t 
tCttt&t, (2) SfcfeltSX3T©iMS[l/s 

(k, k 1) 14, kt^g^O/^k 1 hcDSggl 

(cffl^U Xn70iS*^l/s (k, k 2) (4, t»< # 

(2) 5£(cJ:ft(4\ mi i:S2©f^7X^^O 
|gi<D T ^x£cDftgy<yM# k l £©8681 
S^cd^^x*©^^^* k 2£<oMc9fi* 

sgsttsftscitic&So 
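Equation (2) can be evaluated directly once the scores s(k, k1) and s(k, k2) are available: take the absolute difference of their logarithms for each k and keep the largest value. The sketch below is illustrative only. It assumes strictly positive scores stored in a dictionary keyed by (k, member), and it assumes that k ranges over a caller-supplied collection of member IDs, because the exact range of k is not legible in this text. The name `cluster_distance` and the sample values are hypothetical.

```python
import math


def cluster_distance(k1, k2, members, score):
    """Evaluate D(k1, k2) = maxval_k { abs(log(s(k, k1)) - log(s(k, k2))) }.

    `score[(k, r)]` is the score of member k against representative r and
    must be strictly positive so that its logarithm is defined.
    """
    return max(
        abs(math.log(score[(k, k1)]) - math.log(score[(k, k2)]))
        for k in members
    )


# Hypothetical scores for two members that are also the two representatives.
scores_2 = {
    (1, 1): 1.0, (2, 1): 0.5,
    (1, 2): 0.25, (2, 2): 1.0,
}
d = cluster_distance(1, 2, [1, 2], scores_2)
print(d)  # equals log(4), about 1.386, reached at k = 1
```

Because the distance is a maximum over k, a single member that scores very differently against the two representatives is enough to make D large.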

[0 15 6] &*5*5X*F.gS§gt(4, ±iELfcfeWC|® 
££ft£fc©T*(4&<, ^<Dti!!, flfctf, ^1W7 
^2©?^7X^(D^^t 

[0 15 7] Xfy/S 4 9<D&L3ftfi, Xx-y7S 5 
0(Cjt#, *7X*yy4f»7 8(4, SBlfcS2 0^* 
5X*H±©*7X*j§iaiE8ta<, m;£0BHIf £0* 

f 5o 

[0 15 8] Xf-y^S 5 0(C*5^T, ^^X^fflSgH 



%imt?xz<D*yrtLLT<r>mkc>&wm.i>\ * 

(D^gfl^Sfr^oT, 20<D<77X?(C^X*'J 
:/^^#fc<Z>-e&5i:#*.6ft31fr&, Xf-y7*S5 
lfcJt#, ^7X^Uy^»7 8ti, m\tm2<D=f^ 
5X^^Xr3T^-MB1ig|57 9<DX37~>- btegiS 

30 [o i 5 9] ttettt^xzvyf&i 8(4, mi t 
t, &ta? : 7XZ(D*y^<D : j%, li<D?^7X?i: 

^7 7 X £ 'J > ^7* $ ft t CD <D * 7 X ^ T > / IB 1 <D=jr 
irvXZViryXZTyyctZtPilc, H2cd : ?^7 
X^tc^^X^Uy^^ftfcfecDcD^^X^Ty^^m 
2W^7X^C0^7X^x>'^(C-r§J:^(C, X37 

[0 16 0] «6(:^7X^'Jyy'a57 8(4, SlW 

^^x^fc^^x^uy^ft/cyO/^co^g^y^i 
40 D^mito^^^X^cDf^gyOAOl D(c-T5i:« 
^2©?<75x*ic7^x?u y^ftfc^y/scD 

f^g^O^I D^2©?77X?Oft^^^I D 
[0161] 4*5, m 1 i:^2W^7X?(D9^f 

ft^-73fc{4, mtist^xzoyt rx^i-y/^m*)^ 

[0 16 2] tr^XZOysfm 8&&±<D*.5KLX 

m\ £m2<D???x*zx=i7i/-vic&mt*%~k. 

XT7^S 5 1^6S 5 2(Cji^, ^yrx^XgP8 0 
50 , X3 7'>- h tcg-^T, ffffClgflS 7 4 CD#|§S? 




SF&KRU ffla%»7-r» Ufy^S P 5 4) o 
[0 16 3] t&t>*>. V^*©ifr&, *tt*5^*3W8 

8 o a:, * -r wswrt *»j &m * 5 x * tMist 

&xyhU%H!Mrr«. s&c^y-ff- >x35 8 0(4, 

m i tS2©f^7x^^nfntMt§ 2ooi> 

[0 1 6 4] — ^T'y^S 4 8(C:fcl/->T, Xf'^ 

S 4 7©*7**#»J*yifc:«koT, &hj^x*£2 

XaX-f-yT'S 5 0(C*5^T, 351 £ IS 2©^ 
^7X^7X^ HEfttfffi£<DNtt £ «fc 0 *T'& V V 

t cD«a»(D*aaft oe» wwsw sb 1 1 ai 2 1 5 

*&) , 5 3(Cit#, *?X*»V9in 8 20 

(4, &dJ77X*CD#rfc&ttS*^£*#>, X37i/ 

[0 16 5] f4t)^^7X^Uy^lB7 8(4, fr*S 

W&t* wit Lxhuz.rc®m?7 7>*z<D&* yAic-o 

SCtlctt), (l) SOW-WKiMB&^aTs 
(k 3 , k) ■ZmW.tZx, ZZlZ. 77X^'J^*7 8 
(4, ^(DfOsLfcT^Ts (k 3 . k) (1) 

y/^OI D%*i6So J fLti'7X^i;y^7 8(4, 30 
Xn7~>-h (01 9) (cfcO-S&W 7 ^X ^ 
AOftg^y^ I D£, ^aj^X^Ofrfc&f^;^ 
M<DI D{c#t$7t£o 
[0 16 6]^^, Xx-y7S 5 2 (Cjt#, ^>-ft 
^Xg|$8 Otf, Xa7'>-Hc»^VT»«Ctl*7 4 

4) o 

[0 16 7] -r^^%, V^O%&, ^>x7">7>gl58 
0«, X37->-h^#mcttc4^ &HJ7VX 

©eiBSJiJfcBSWSo ^LT^y-r7"yXgi58 014, 
J£fg#»(c*5(t3f$tB^ 7X*(c*tiS-f 3xy h 'J <Dg 

[0 16 8] CCX\ 02 1 OXf'y^S P 4 7©^7 
X*#fJfflJItt» 02 2tc^T^X*#SiMS¥llIR 
T 4tc^oTtTt>ns 0 

[0 16 9] -r*te^«^K««yifB6 0T*(4, 02 2 
©Xr-y^S P 4 6^6Xf77*S P 4 7 (cilty^iKD 
^X*#fiJS[lJI#l®R T4^XT7 7*S P 6 OiOo^ 50 
£-f«#J(cXT<y:/S 6 1 (cfci^T, ^7X

*i> y^7 8A\ i*igw^y/^ i/ci)n*.6ft 
t, Jig, $ i <o«fiai^v;^g2o«i«a^>/^ 

l^7o • 

[0 17 0] *LX\ i<Xf'>7*S 6 2(C*5^T, * 
7^^ U V ^g|57 8 (4, ^1 ©<Ett£* » S 2 CD 

fif^i^ y^znztiiiM* y^ttzcttfxzz 

[0 17 1] CCT% SlXtt^ 
S^/Sfc-fSCfctfT'tSfrif^frtt (1) i&DttJ? 

£ff 9 ftS & 3 , c ©I (c ffl v n S x nr-s 
(k* , k) (4, X37i/-hZ&mrZ££X'$mZ 

ft5o 

[0 17 2] 7,7- V 7 S 6 2(C*5^T, 3? 1 ©{gf^g/ 
[0 17 3] X7'>7*S 6 2(C*5^T, % 1 (Dig 

f^y/st, ^2cD{g^^y^% ; en ; fn^p<y 
2 ocd * 7 x * fc#wr 5 c t wx* 1 5 i: mfeznrcm 

Xf77*S 6 3(Cj1*, ^X*yy^g!$7 8(4, 

mi©{gft«^>^fc, S2©iRt^>«y/W*hfn 
f^a^y^i:^5J;7(c, ^W77X^co><y/^2o 

CD7^X^(C^SJU, ^©»»J»0 2O0^5^^©e 

5X#<D<£*f (J-XT, iSg, <iffi^vX^©fflfcV>7) 
tLT, Xfy7*S6 4(Cit& 0 
[0 17 4] Xfy/S 6 4T*(4, ^7^X^Uy7gP7 
8(i, ^(H^^X^cD^y/^ffT*, ^/c^li:m2tD 
^tS^ y^CDfflt LTSS? LTV^t> 2 ~D<D* >^<D 
i»y*«A»if5M!pKU Xr 
•yyS6 HCMO, fc»2©<S«a^>^Offl 
i: L TS^ t T V ^ffl 5 5 X ^ <D 2 OO P< ^<Dffl 

[0 17 5] SfcXx-y^S 6 4(C*5V^T, ^ 1 i:^2 

cffitfM * y ^<om t l x m$i u t v ^ ^ ^ w * 5 x ^ 

O 2 o«D^< y/sofitf fc^fc W£Sftfc*&, Xr >y ^ 
S6 5(Cj1^, ^7X^U>^a57 8(4, iitf^^X^ 

[0 17 6] Xr-y^S 6 5(C*5V>T, ^S^^X^cD 
$tf#fcba^fcW£-Sttfc*&, Xx>y^S6 6?rX 
+ >yytT, U^-y-T?.o C(Dli-&(4, 02 icoxf 
-y^S 4 8(c*3^T, ^W^^X^^WrSCtA'iT' 
[0 17 7] Xf77S 6 5Kfe^T % <g*i*7

X*©ffi##£t3£W££tlft*il^ Xf77*S6 6 

*7**yy?«7 8ti* fc»*7x*©e# 

7X*H±0H®*7:**|llfl0**#fcSo ^LT, * 

^©^^X^kSn^o 
[0 17 8] 02 1 ©Xr^S 4 8lc:fc 

[0 1 7 9] W±©*5te* *5X*U>^aJ7 8tC*5 

ft*7X*©4>*^ *r*SSIS**rfcfc*y/^LT 

%*©ttfcB*7X*©*rftft*yv<fcbT, EttiJ^X 20 
* £^©1$m * 5 x # © ^ y/ SES^Ttffflf 3 <fc 9 
KLft©-?, *efifB**©g«&1#*jWSffiLT^3 

[0 18 0] 2e>ic*y-r7"yxgfl8 oicfc^T, 

*s^©JMs»«^©as%i&stefT5 c t^-c-t 
[0181] sfc, mm. (gtc, wf-vtrm? zic 

*-©J:9fc*agSIS«, &tiJ*7X*©#*iJK<fcoT, 

^febi jwe l < ttw* nfc*a»ai t aguo * 5 x 2 

fc?5**yy?SftS. f LT> C©J:9&^X* 

*\ c©xyhy©S»369JttiEb<*aJSttftav3fc 

fcfcvvr*#fcxa7*4*.3ci:ttai#\ for, <5 

Kytt*©&©SFa^fcttHfcA,2ie*Uftv\, 
[0 18 2] CCX\ 0 2 3tt» *S»lg©SSB*fTo 40 

Tf#p>nfc^vx^yyy^^LTi/^„ 0 

2 3 IC&^Tti, &x>hy AMO(D^7X^ 
%S0t^S o *ft, 02 3©£Wi, #^X£©ft 
m*y^ (*SSIS) ©*«S5U*&LT*J»), 02 3 

[0 18 3] -f&fc-£0 2 SKfeWT, 0fl*.fcfS 1 fir© 

xy h y WSJ © 1 o©58fSfttttf * y/< 

gga^J«> fdoroatj (Ka7-) K*?T^4, $ 50 
ft, mtf»2fr©x^hytt > *a§if§rMSj©30 
Ofta^>/^OSIR5l|tt, ^kuroj (*P) icftsx-TV^ 

So 

[0 18 4] 0Ut£S?mT©xy h-y^ *a 

»Rf#]©40©«Btf*y^fcfcoTV>S*5*** 
SLT&D, ^©ttS* y^©^H«i, fNhoNde:s 

uj (y^y-f-x) cftoTv^. *ft, .flfHAii*8fT 
©xy h y *9JNH f * U y s>J © i oco^lS fc » ■ * 
a«*ag r#j © 1 9<DJiB^y/^6oT^8*77^ 
^atTfey, *©tt«*>v<©*ltol»ljlfc fohoNj 
tefcoT^So fl&©xyhU felloe fc* 

[0 1 8 5] 02 3t«fcntf, |I|-©*a^f§©5gfSlC 

So 

[0 18 6] ftfc, 0 2 3©3?8ft©xy h- yicfcl^T 

©i9ow, r— of5x*fc*9^*u y^n 

Ti/^o ^©^X^&^^y^&oTl^SIISfr 
6, *a»fgr#j©*5X*£&£^tT'&&k#7tP> 

*.mmmi*uyi?io>5mi>. *©*7X*© 

•f * * 5 x * t , *a§si§ r* u om&itti** y 
^k-rs^7X*jc*5x*y y^*tisk#*6n 

[0 18 7] (4-2) «K»a56 20*f*«l«|ja 
«BM»6 2Oj|{4:(»ttAlCO^TB0»r«. 
[0 18 8] 02 4Rt>"02 SfC^-r.t^tC, fflBSttffi 

6 2«, iiwjcaftr s«»Tt- £BMHAiciSflpr« 

Z.£ftT*%Z>t>\ CCD*^750 (05) 

nsB#ffi*f s i Ats-^<ia#^p)M^^-y^tt 
wr*««i'a«ia»9 0 4:, «nn*nft«^*-y*is 
icff*BOT«nRM»itf 9 i *^«?n5. *n 

JR7 • 7^;^^^ (Gabor Filtering) J^lSfflU 

sft, «^^-y*»6«*««"rs«iB««iaKttf* 

h • ' v-y— y (Support Vector Machine : 

SVM) J^fflLT^So 

[0 18 9] C£>S^fB8fl6 2fi, SM^^-y^lSlg^ 

[0 19 0] 02 4tcfi, Higg5g|56 2<D^Wm<Dm 
it*. Sft02 5tc{i, ®|gg|Si56 2<0KMgtR|©«lS 

[0 19 1] *aSPBK«V^Ttt, 02 4(CSti9 
CCDA^750 (05) fr6A*«nfta- If© 
wmm:iiX7 • 7-{)if<x*%:z>mmmtm®9 or- * u y^mmt, mmmmm®9 1 fcfcnt?.^ 
\Lrz&mm#~ h • • ^-y-yx^m - h • • v^-mco^t, ^n^np»c^ 
1359 1 fc&A£n3 0 gsisis®aa59 1 t*«, -r^o 

K^hm&Zft%¥lim<nT-Z?tzt>%%M : r-5 [0194] (4-2-1) ##7 • 

[o i 92] ztc, mmmmas^xit. 02 stc^-r AF^^Ufflfls^fi, &3#£©£{uk#ltiiiri4£ 

Lrcm%tfmmffi,mm®9 1 usA^n^o nigse&a Mg$£ns 0 ##7 • 7^w;y^ cn^ig 

[0 2 8 8] ■gttf&fHc&tM* h^@tC*3V> 

x^m 1 1 zvnwvz anttAmfr e> ®.m? % *tis¥S 




* [08] ^mi^5fta¥l®^-r7n-^^-hT»* 

So 

[09] ^mi^Jaa?)i^-r7D-^+-h^ 

[01 0] ^M¥SSQSi^fc^lt?.WIg^J^-riilSS0 

[sin %M¥^mmmicm%ttmm*7K?®Mm 

[012] F I DRXSS I Dt^m(i:©fTM^<D^ 



%mmm<Dwm%. Rmm^mmzmzmm 
s frgfr* mt % ym^wt t , ww^ t * s 

[0 2 8 9] 
[0BB©fg¥*iHW] 

[0 1 ] *%M<DmmK£z>n#v h<Dftmm&*7n? 
[02] xnMmmic «o*fh owBWuafcjs-r 
[03] xnmmmic & s p # «y h cd^ii^cd^ 30 

tC<ttT3B§il0T ; &3o 

[04] *^5SOJf^{cJ;§P^'y h^^gP^OlKW 
^•T5BS^0T^S„ 

[05] *mm<DBmic£5u$v b<D^m^.(om.m 
[06] ^m^mmicmt^^^yum^A ovmm 

[07] ytUfCfctfSF l DSD'S I Dt&mtOlM 



[FIG. 7] (Drawing not recoverable from the scan; legible labels: FID, SID.)

[FIG. 10] (Drawing not recoverable from the scan; caption ends "(1)".)



[013] %M¥mmicm%ttmiZ7Kt®mmx& 
[014] «Hu^s5!ia«ftc*3tt5*tis«?ij^-riis^0 

[015] ^KIilSgi5c7)«^^^P-y^0T'fe5o 
[016] #|g^»(D^(t«-r§1i^0T$.§o 

[017] x&mmvmwicmzm&mx&Ze 
[0i8] mm^t h)Vf^yrmmn^<r)mm\^ 
?%mimx3b%o 

[01 9] 7>37is-h<Dmwicm%Mi&mx3bz 0 
[020] ^pmmm.^\m^fyv2-^^-vxh 

[02 1] *ggsi§5Qa^iiM^-r7P-^^-hT'fe 

[02 2] ^^7 > ^^SiMa^)i^1-7P-f-V-h 

[02 3] fy^a u-v-a y^s^-TM^;0T'fe?> o 

[02 4] ^S^tCt3lt§IM|gilSP(Dl#^^-r^P-V 

[02 5] |gi3ll^fc*5lt5lS^gPOli^^-ryPy 
^0T?£3 O 

[ftWSifi] 

1 P#y K 4 0 *-<yym%t>, 5 0 cc 

DA^7, 5 1- V-f^P^X 54- Xtf-A, 

6o-v-if^B»» % 6i ummm^, 6 2 ^ 

tSUgfl, 6 3 flfSiiiWgfl, 6 4 gp&fc^ 6 

5 **:Vs S1A B$!f8^ SIB, S3 

^W§, D1.D2 £1^Jt-*, R T 1 £ 



[FIG. 11] (Drawing not recoverable from the scan; caption ends "(2)".)

[FIG. 2] (Drawing not recoverable from the scan; legible reference numerals: 1, 2, 3, 5B, A14, 34; caption ends "(2)".)

[FIG. 5] (Drawing not recoverable from the scan; legible labels: CCD, 45, 50, 53, 54, 55, 56, 57, 58; caption ends "(2)".)

[FIG. 8] (Drawing not recoverable from the scan.)



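[FIG. 3] (Drawing not recoverable from the scan; legible reference numerals: 2, 3, 4A, 4B; caption ends "(3)".)

[FIG. 4] (Drawing not recoverable from the scan; legible reference numerals: 1, 5A; caption ends "(1)".)

[FIG. 6] (Drawing not recoverable from the scan; legible labels: 60, 61, 62, 63, 64, D1, D2, S1A, S1B.)

[FIG. 9] (Drawing not recoverable from the scan; caption ends "(2)".)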



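[FIG. 12] (Drawing not recoverable from the scan; legible labels: FID, SID.)

[FIG. 13] (Drawing not recoverable from the scan; caption ends "(3)".)

[FIG. 14] (Drawing not recoverable from the scan; caption ends "(4)".)

[FIG. 15] (Drawing not recoverable from the scan; legible reference numerals: 73, 74, 76, 77.)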



[01 6] 







boku r«j 

chisau [331 
doko [£d] 
genki £tcS] 
fro [fej 
janai [C*$:i>] 

kudasal C<££1*] 


boku 
chisau 
doko 

iro 
janai 
kirai 
kudssai 



a 1 e miss** 



60 



0 1 5 #^KKSJC0«m 



[01 7] 



1 8] 



[kono | sono] iro va: 
kore £(ga I wa I mo)]: 
(Chaeaul lie) (Sail]: 
tail: 

(desu [ da) [yo] | yo; 
yo]: 



jut t f ines I tines] Scot | [$neg] Jthis; 
iro] (deau | janai | da) [yo]; 
lor $garbago $color2; 



ea 1 7 x&auN 



[FIG. 18] (Table indexed by ID 1, 2, ..., N; cell contents not recoverable from the scan.)



[FIG. 19] (Table whose rows and columns are indexed by ID 1, 2, 3, ..., N, with entries s(i, j) running from s(1,1) to s(N,N); other columns not recoverable from the scan.)

[FIG. 20] (Flowchart not recoverable from the scan; legible step labels: SP30, SP32, SP35, SP36.)

[FIG. 21] (Flowchart not recoverable from the scan; legible step labels: SP40, SP43, SP54.)

[FIG. 22] (Flowchart not recoverable from the scan; legible step label: SP67.)






[02 3] 







doroa: 


JUSxi: 


kuro 


Bgx3: 


Nfuro 


S3x20; 


NboNn 


*x.lB; 


hcHN 
NhoNrfa ' 


*x6: 
*xi0; 


KhoHcfe:8U 
ohoN 


*x4; 

^•Ui/^xi:*xt9: 


hoMgda*85oNre:a: 


*x2; 


aimodori : 
on i dor i : 


Mtfexii: 
SfexiO; 


e: Imldori: 


»&x3; 


Nmldori: 


»fex5; 


armldorl riroireau 


Sfex4: 


Nro:ka 


BTxID; 


Hro:kaSa 


BSTxlO; 



19 2 3 y;i I/-/3 >t§* 
[FIG. 24] and [FIG. 25] (Drawings not recoverable from the scan; legible labels: S1A, 62, 91.)



(51)Int.Cl. 

G 1 0 L 15/06 
15/20 
17/00 



ISSUE* 



mmii5p a pJIIEdba a JII6Tg 7#35^V: 

(72)$^ iSU 

SMWUEJto^lieTS 7#35^V: 

(72)HW# ^« B£ 

*SU^JIieyfcfiiJI|6TB 7#35^V: 



F I 

G 1 0 L 3/00 



-?3-K (##) 



5 4 5 A 

5 2 1 J 

5 4 5 E 

5 3 1 P 



F*-M##) 2C150 BA11 CA01 DA04 DA24 DA26 
DA27 DA28 DF03 DF04 DF06 
DF33 ED42 ED47 ED52 EF07 
EF16 EF17 EF22 EF23 EF29 
EF33 EF36 

3C007 AS36 CS08 HS09 HS27 KS23 
KS31 KS39 KT01 KT11 LW03 
U12 MT00 MT14 WA03 WA13 
WB17 WB19 WC07 

5D015 KK02 KK04 LL02 

5D045 AB11 



PATENT ABSTRACTS OF JAPAN 
(11) Publication number: 2003-255989 
(43) Date of publication of application: 10.09.2003 



(51) Int.Cl. G10L 15/22 
A63H 11/00 
B25J 5/00 
G10L 13/00 
G10L 15/00 
G10L 15/06 
G10L 15/20 
G10L 17/00 



(21) Application number: 2002-060425 (71) Applicant: SONY CORP 

(22) Date of filing: 06.03.2002 (72) Inventor: SHIMOMURA HIDEKI 
AOYAMA KAZUMI 
YAMADA KEIICHI 
ASANO KOJI 
OKUBO ATSUSHI 



(54) LEARNING SYSTEM AND LEARNING METHOD, AND ROBOT 
APPARATUS 

(57)Abstract: 

PROBLEM TO BE SOLVED: To solve the problem that conventional robot 
apparatuses and the like have not been able to learn names in a natural manner. 

SOLUTION: The system successively learns the names of objects by obtaining 
the name of a target object through dialogue with a human; storing the name in 
association with data on a plurality of different characteristics detected for the 
target object; recognizing the appearance of a new object based on the stored 
data and the association information; acquiring the name and the data on the 
various characteristics of the new object; and storing the corresponding 
association information. 



LEGAL STATUS [Date of request for examination] 12.09.2003 
[Date of sending the examiner's decision of rejection] 

[Kind of final disposal of application other than the examiner's decision of 
rejection or application converted registration] 
[Date of final disposal for application] 
[Patent number] 3529049 
[Date of registration] 05.03.2004 

[Number of appeal against examiner's decision of rejection] 

[Date of requesting appeal against examiner's decision of rejection] 

[Date of extinction of right] 



* NOTICES * 

JPO and NCIPI are not responsible for any 
damages caused by the use of this translation. 

1. This document has been translated by computer. So the translation may not 
reflect the original precisely. 

2. **** shows the word which can not be translated. 
3. In the drawings, any words are not translated. 



CLAIMS 
[Claim(s)] 

[Claim 1] A learning apparatus comprising: dialogue means having a function for 
conversing with a human and for acquiring the name of a target object from the 
human through the dialogue; a plurality of recognition means, each detecting a 
different predetermined characteristic of the target object and recognizing the 
target object based on the detection result and on data of the corresponding 
characteristic of known objects stored beforehand; storage means for storing 
association information in which the name of each known object is associated 
with the recognition results of the recognition means for that object; judgment 
means for judging whether the target object is a new object, based on the name 
of the target object acquired by the dialogue means, the recognition results of the 
recognition means for the target object, and the association information stored in 
the storage means; and control means for, when the judgment means judges the 
target object to be a new object, causing each recognition means to store the 
data of the corresponding characteristic of the target object and causing the 
storage means to store the association information on the target object. 
[Claim 2] The learning apparatus according to claim 1, wherein, when the 
judgment means judges the target object to be a known object, the control 
means controls each recognition means that correctly recognized the target 
object so that it performs additional learning. 

[Claim 3] The learning apparatus according to claim 1, wherein, when the 
judgment means judges the target object to be a known object, the control 
means controls each recognition means that did not correctly recognize the 
target object so that it performs correction learning. 

[Claim 4] The learning apparatus according to claim 1, wherein the judgment 
means judges whether the target object is a new object by a majority decision 
among the name of the target object acquired by the dialogue means and the 
recognition results of the recognition means for that object, with reference to the 
association information stored in the storage means. 

[Claim 5] The learning apparatus according to claim 1, wherein the control means 
controls the dialogue means so as to prolong the dialogue with the human as 
needed. 
[Claim 6] A learning method comprising: a first step of conversing with a human 
and acquiring the name of a target object from the human through the dialogue, 
and of detecting a plurality of different predetermined characteristics of the target 
object and recognizing the target object based on the detection results and on 
data of each characteristic of known objects stored beforehand; a third step of 
judging whether the target object is a new object, based on the acquired name of 
the target object, the recognition results based on the respective characteristics 
of the target object, and association information, stored beforehand, in which the 
name of each known object is associated with the recognition results for that 
object; and a fourth step of, when the target object is judged to be a new object, 
storing the data of each characteristic of the target object and the association 
information on the target object. 

[Claim 7] The learning method according to claim 6, wherein, in the fourth step, 
when the target object is judged to be a known object, additional learning is 
performed for each characteristic by which the target object was correctly 
recognized. 

[Claim 8] The learning method according to claim 6, wherein, in the fourth step, 
when the target object is judged to be a known object, correction learning is 
performed for each characteristic by which the target object was not correctly 
recognized. 

[Claim 9] The learning method according to claim 6, wherein, in the third step, 
whether the target object is a new object is judged by a majority decision among 
the acquired name of the target object and the recognition results based on the 
respective characteristics of that object, with reference to the association 
information. 
[Claim 10] The learning method according to claim 6, wherein, in the fourth step, 
the dialogue with the human is prolonged as needed. 

[Claim 11] A robot apparatus comprising: dialogue means having a function for 
conversing with a human and for acquiring the name of a target object from the 
human through the dialogue; a plurality of recognition means, each detecting a 
different predetermined characteristic of the target object and recognizing the 
target object based on the detection result and on data of the corresponding 
characteristic of known objects stored beforehand; storage means for storing 
association information in which the name of each known object is associated 
with the recognition results of the recognition means for that object; judgment 
means for judging whether the target object is a new object, based on the name 
of the target object acquired by the dialogue means, the recognition results of the 
recognition means for the target object, and the association information stored in 
the storage means; and control means for, when the judgment means judges the 
target object to be a new object, causing each recognition means to store the 
data of the corresponding characteristic of the target object and causing the 
storage means to store the association information on the target object. 
[Claim 12] The robot apparatus according to claim 11, wherein, when the 
judgment means judges the target object to be a known object, the control 
means controls each recognition means that correctly recognized the target 
object so that it performs additional learning. 

[Claim 13] The robot apparatus according to claim 11, wherein, when the 
judgment means judges the target object to be a known object, the control 
means controls each recognition means that did not correctly recognize the 
target object so that it performs correction learning. 

[Claim 14] The robot apparatus according to claim 11, wherein the judgment 
means judges whether the target object is a new object by a majority decision 
among the name of the target object acquired by the dialogue means and the 
recognition results of the recognition means for that object, with reference to the 
association information stored in the storage means. 

[Claim 15] The robot apparatus according to claim 11, wherein the control means 
controls the dialogue means so as to prolong the dialogue with the human as 
needed. 
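
The judgment by majority decision and the choice between storing, additional learning, and correction learning described in the claims can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names `judge` and `plan_learning`, the layout of the association information (a dictionary from a known name to a tuple holding one ID per recognizer), and the strict-majority rule with an "undecided" outcome (taken here as the cue to prolong the dialogue) are all assumptions made for illustration.

```python
from collections import Counter


def judge(name, recognized_ids, associations):
    """Majority decision over the heard name and each recognizer's result.

    associations: dict mapping a known name to a tuple holding the ID that
    each recognizer assigned to that object (hypothetical layout).
    recognized_ids: one ID per recognizer, or None when it reports "unknown".
    Returns ("known", name), ("new", None) or ("undecided", None).
    """
    # Each vote names the known object it points to, or None for "unknown".
    votes = [name if name in associations else None]
    for slot, rid in enumerate(recognized_ids):
        owner = None
        if rid is not None:
            for known_name, ids in associations.items():
                if ids[slot] == rid:
                    owner = known_name
                    break
        votes.append(owner)

    winner, count = Counter(votes).most_common(1)[0]
    if 2 * count <= len(votes):
        return ("undecided", None)  # no strict majority: keep talking
    if winner is None:
        return ("new", None)
    return ("known", winner)


def plan_learning(verdict, recognized_ids, associations):
    """Choose a learning action for each recognizer from the verdict."""
    kind, winner = verdict
    if kind == "new":
        # New object: every recognizer stores its feature data.
        return ["store"] * len(recognized_ids)
    if kind == "known":
        expected = associations[winner]
        return ["additional" if rid == expected[slot] else "correction"
                for slot, rid in enumerate(recognized_ids)]
    return []  # undecided: nothing is learned yet
```

With `associations = {"Alice": (1, 5)}`, hearing the name "Alice" while the two recognizers report `(1, None)` yields a "known" verdict, and the second recognizer is scheduled for correction learning because it failed to recognize the object.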

DETAILED DESCRIPTION 

[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] The present invention relates to a learning apparatus, a 
learning method, and a robot apparatus, and is suitably applied to an 
entertainment robot. 
[0002] 

[Description of the Prior Art] In recent years, many entertainment robots for 
home use have been commercialized. Some of these robots carry various 
external sensors, such as a CCD (Charge Coupled Device) camera and a 
microphone, recognize their surroundings based on the outputs of these 
sensors, and act autonomously based on the recognition results. 
[0003] 

[Problem(s) to be Solved by the Invention] If such an entertainment robot could 
memorize a new object (including a person; the same applies hereafter) in 
association with its name, communication with the user would become 
smoother, and the robot could respond flexibly to various commands from the 
user, such as "kick the ball", that concern objects other than those whose 
names were registered beforehand. In the following, memorizing an object in 
association with its name in this way is expressed as "learning the name", and 
such a function is called a "name learning function". 

[0004] Moreover, when such a name learning function is installed in an 
entertainment robot, it would be most desirable in terms of naturalness if the 
robot could learn the name of a new object through ordinary dialogue with 
people, just as humans usually do, and this could further improve its 
entertainment value. 
[0005] With conventional techniques, however, it is difficult to make an 
entertainment robot judge when a new object whose name should be learned 
has appeared in front of it. 

[0006] For this reason, many conventional techniques recognize an object and 
register its name only after the user has switched the operation mode to a 
registration mode, either by giving an explicit voice command or by pressing a 
specific touch sensor provided on the robot. However, considering natural 
interaction between a user and an entertainment robot, name registration by 
such explicit instruction is quite unnatural. 

[0007] The present invention has been made in view of the above points, and 
proposes a learning apparatus, a learning method, and a robot apparatus that 
can markedly improve entertainment value. 
[0008] 

[Means for Solving the Problem] To solve this problem, in the present invention a 
learning apparatus is provided with: dialogue means having a function for 
conversing with a human and for acquiring the name of a target object from the 
human through the dialogue; a plurality of recognition means, each detecting a 
different predetermined characteristic of the target object and recognizing the 
target object based on the detection result and on data of the corresponding 
characteristic of known objects stored beforehand; storage means for storing 
association information in which the name of each known object is associated 
with the recognition results of the recognition means for that object; judgment 
means for judging whether the target object is a new object, based on the name 
of the target object acquired by the dialogue means, the recognition results of the 
recognition means for the target object, and the association information stored in 
the storage means; and control means for, when the judgment means judges the 
target object to be a new object, causing each recognition means to store the 
data of the corresponding characteristic of the target object and causing the 
storage means to store the association information on the target object. 

[0009] As a result, this learning apparatus can learn the names of new persons, 
objects, and the like automatically through ordinary dialogue with people, just as 
humans usually do, without requiring name registration by explicit instruction 
from the user, such as the input of a voice command or the pressing of a touch 
sensor. 
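
The storage means described above, which keeps each known name together with the recognition results for that object, could be modelled along the following lines. This is only a sketch under stated assumptions: the class name `AssociationMemory`, its method names, and the choice of one ID per named recognizer are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class AssociationMemory:
    """Hypothetical storage means: links a learned name to one ID per recognizer."""

    recognizer_names: tuple
    entries: dict = field(default_factory=dict)  # name -> {recognizer: id}

    def store(self, name, ids):
        """Store the association information for one object."""
        if set(ids) != set(self.recognizer_names):
            raise ValueError("one ID per recognizer is required")
        self.entries[name] = dict(ids)

    def is_known_name(self, name):
        """True if the name has already been learned."""
        return name in self.entries

    def name_for(self, recognizer, rid):
        """Return the name associated with a recognizer's ID, or None."""
        for name, ids in self.entries.items():
            if ids[recognizer] == rid:
                return name
        return None
```

A caller would create the memory with the recognizers it uses, for example `AssociationMemory(("face", "speaker"))`, and call `store` whenever a new object has been learned.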

[0010] Moreover, the learning method of the present invention is provided with: 
a first step of conversing with a human and acquiring the name of a target object 
from the human through the dialogue, and of detecting a plurality of different 
predetermined characteristics of the target object and recognizing the target 
object based on the detection results and on data of each characteristic of 
known objects stored beforehand; a third step of judging whether the target 
object is a new object, based on the acquired name of the target object, the 
recognition results based on the respective characteristics of the target object, 
and association information, stored beforehand, in which the name of each 
known object is associated with the recognition results for that object; and a 
fourth step of, when the target object is judged to be a new object, storing the 
data of each characteristic of the target object and the association information 
on the target object. 
[0011] Consequently, with this learning method, the names of new persons, 
objects, and the like can be learned automatically through ordinary dialogue with 
people, just as humans usually do, without requiring name registration by 
explicit instruction from the user, such as the input of a voice command or the 
pressing of a touch sensor. 

[0012] Furthermore, in the present invention a robot apparatus is provided with: 
dialogue means having a function for conversing with a human and for acquiring 
the name of a target object from the human through the dialogue; a plurality of 
recognition means, each detecting a different predetermined characteristic of the 
target object and recognizing the target object based on the detection result and 
on data of the corresponding characteristic of known objects stored beforehand; 
storage means for storing association information in which the name of each 
known object is associated with the recognition results of the recognition means 
for that object; judgment means for judging whether the target object is a new 
object, based on the name of the target object acquired by the dialogue means, 
the recognition results of the recognition means for the target object, and the 
association information stored in the storage means; and control means for, 
when the judgment means judges the target object to be a new object, causing 
each recognition means to store the data of the corresponding characteristic of 
the target object and causing the storage means to store the association 
information on the target object. 

[0013] Consequently, this robot apparatus can learn the names of new persons, 
objects, and the like automatically through ordinary dialogue with people, just as 
humans usually do, without requiring name registration by explicit instruction 
from the user, such as the input of a voice command or the pressing of a touch 
sensor. 
[0014] 

[Embodiment of the Invention] One embodiment of the present invention is 
described in detail below with reference to the drawings. 

[0015] (1) Configuration of the robot according to this embodiment 
In FIG. 1 and FIG. 2, reference numeral 1 denotes as a whole a bipedal walking 
robot according to this embodiment. A head unit 3 is arranged on top of a body 
trunk unit 2, arm units 4A and 4B of identical configuration are arranged at the 
upper left and right of the body trunk unit 2, and leg units 5A and 5B of identical 
configuration are attached at predetermined positions on the lower left and right 
of the body trunk unit 2. 

[0016] The body trunk unit 2 is constituted by connecting a frame 10, which 
forms the upper trunk, and a waist base 11, which forms the lower trunk, through 
a waist joint mechanism 12. By driving the actuators A1 and A2 of the waist joint 
mechanism 12, which are fixed to the waist base 11 of the lower trunk, the upper 
trunk can be rotated independently around a roll axis 13 and a pitch axis 14, 
which are orthogonal to each other as shown in FIG. 3. 

[0017] The head unit 3 is attached, through a neck joint mechanism 16, to the 
center of the top face of a shoulder base 15 fixed to the upper end of the frame 
10. By driving the actuators A3 and A4 of the neck joint mechanism 16, the head 
unit 3 can be rotated independently around a pitch axis 17 and a yaw axis 18, 
which are orthogonal to each other as shown in FIG. 3. 

[0018] Furthermore, the arm units 4A and 4B are attached to the left and right of 
the shoulder base 15 through shoulder joint mechanisms 19. By driving the 
actuators A5 and A6 of the corresponding shoulder joint mechanism 19, each 
arm unit can be rotated independently around a pitch axis 20 and a roll axis 21, 
which are orthogonal to each other as shown in FIG. 3. 

[0019] In this case, each of the arm units 4A and 4B is constituted by connecting 
an actuator A8, which forms the forearm, to the output shaft of an actuator A7, 
which forms the upper arm, through an elbow joint mechanism 22, and attaching 
a hand part 23 to the tip of the forearm. 

[0020] In each of the arm units 4A and 4B, driving the actuator A7 rotates the 
forearm around a yaw axis 24 shown in FIG. 3, and driving the actuator A8 
rotates the forearm around a pitch axis 25 shown in FIG. 3. 

[0021] The leg units 5A and 5B, on the other hand, are each attached to the 
waist base 11 of the lower trunk through a hip joint mechanism 26. By driving the 
actuators A9 to A11 of the corresponding hip joint mechanism 26, each leg unit 
can be rotated independently around a yaw axis 27, a roll axis 28, and a pitch 
axis 29, which are orthogonal to one another as shown in FIG. 3. 

[0022] In this case, each of the leg units 5A and 5B is constituted by connecting 
a frame 32, which forms the lower leg, to the lower end of a frame 30, which 
forms the thigh, through a knee joint mechanism 31, and connecting a foot 34 to 
the lower end of the frame 32 through an ankle joint mechanism 33. 

[0023] Thus, in each of the leg units 5A and 5B, driving the actuator A12, which 
forms the knee joint mechanism 31, rotates the lower leg around a pitch axis 35 
shown in FIG. 3, and driving the actuators A13 and A14 of the ankle joint 
mechanism 33 rotates the foot 34 independently around a pitch axis 36 and a roll 
axis 37, which are orthogonal to each other as shown in FIG. 3. 
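
The joint layout given in paragraphs [0016] to [0023] can be summarized as a small lookup table. The following Python sketch is illustrative only. The text names the actuators and axes of each mechanism but does not say which actuator drives which axis, so pairing them in the order listed is an assumption, as are the names `Joint`, `TRUNK_AND_HEAD`, `ARM`, `LEG`, and `degrees_of_freedom`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Joint:
    actuator: str   # actuator label from the description
    location: str   # joint mechanism or body part it belongs to
    axis: str       # "roll", "pitch" or "yaw"
    axis_ref: int   # reference numeral of the axis


# Present once in the robot.
TRUNK_AND_HEAD = [
    Joint("A1", "waist joint mechanism 12", "roll", 13),
    Joint("A2", "waist joint mechanism 12", "pitch", 14),
    Joint("A3", "neck joint mechanism 16", "pitch", 17),
    Joint("A4", "neck joint mechanism 16", "yaw", 18),
]

# Present in each of the arm units 4A and 4B.
ARM = [
    Joint("A5", "shoulder joint mechanism 19", "pitch", 20),
    Joint("A6", "shoulder joint mechanism 19", "roll", 21),
    Joint("A7", "upper arm", "yaw", 24),
    Joint("A8", "forearm", "pitch", 25),
]

# Present in each of the leg units 5A and 5B.
LEG = [
    Joint("A9", "hip joint mechanism 26", "yaw", 27),
    Joint("A10", "hip joint mechanism 26", "roll", 28),
    Joint("A11", "hip joint mechanism 26", "pitch", 29),
    Joint("A12", "knee joint mechanism 31", "pitch", 35),
    Joint("A13", "ankle joint mechanism 33", "pitch", 36),
    Joint("A14", "ankle joint mechanism 33", "roll", 37),
]


def degrees_of_freedom():
    """Count driven axes, with the arms and legs each present twice."""
    return len(TRUNK_AND_HEAD) + 2 * len(ARM) + 2 * len(LEG)
```

Under these assumptions the table lists 14 distinct actuator labels, and counting both arms and both legs gives 24 driven axes for the whole robot.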

[0024] On the other hand, as shown in drawing 4 , the Maine control section 40 
which manages the motion control of the robot 1 whole concerned, the 
circumference circuits 41, such as a power circuit and a communication circuit, 



and the control unit 42 with which a box comes to contain a dc-battery 45- 
( drawing 5 ) etc. are arranged in the tooth-back side of the waist base 1 1 which 
forms the lower trunk of the idiosoma unit 2. 

[0025] The control unit 42 is connected to the sub control sections 43A-43D
arranged in the respective component units (the body unit 2, the head unit 3, the
arm units 4A and 4B, and the leg units 5A and 5B), so that it can supply the
required power supply voltage to these sub control sections 43A-43D and
communicate with them.
[0026] Each of the sub control sections 43A-43D is connected to the actuators
A1-A14 in its corresponding component unit, and can drive those actuators into
the state specified by the various control commands given from the main control
section 40.
[0027] Furthermore, as shown in drawing 5, the head unit 3 is provided, at
respective predetermined locations, with an external sensor section 53 consisting
of a CCD (Charge Coupled Device) camera 50 that functions as the "eyes" of the
robot 1, a microphone 51 that functions as its "ears", a touch sensor 52, and so
on, and with a loudspeaker 54 that functions as its "mouth". An internal sensor
section 57 consisting of a battery sensor 55, an acceleration sensor 56, and so
on is arranged in the control unit 42.



[0028] The CCD camera 50 of the external sensor section 53 images the
surroundings and sends the resulting picture signal S1A to the main control
section 40, while the microphone 51 collects the voice commands that a user gives
as voice input, such as "walk", "lie down", or "chase the ball", and sends the
resulting sound signal S1B to the main control section 40.
[0029] As is clear from drawing 1 and drawing 2, the touch sensor 52 is provided
on the upper part of the head unit 3. It detects the pressure produced by physical
contact from a user, such as "stroking" or "striking", and sends the detection
result to the main control section 40 as a pressure detection signal S1C.

[0030] Furthermore, the battery sensor 55 of the internal sensor section 57
detects the remaining energy of the battery 45 at a predetermined period and
sends the detection result to the main control section 40 as a remaining-battery
detection signal S2A, while the acceleration sensor 56 detects the acceleration
in three axial directions (x-axis, y-axis, and z-axis) at a predetermined period
and sends the detection result to the main control section 40 as an acceleration
detection signal S2B.

[0031] The main control section 40 judges the robot 1's surroundings and internal
state, the presence or absence of commands from a user, the presence or absence
of physical contact from a user, and so on. It does so based on the picture signal
S1A, the sound signal S1B, the pressure detection signal S1C, and so on
(hereafter collectively called the external sensor signal S1) supplied from the
CCD camera 50, the microphone 51, the touch sensor 52, and so on of the external
sensor section 53, and on the remaining-battery detection signal S2A, the
acceleration detection signal S2B, and so on (hereafter collectively called the
internal sensor signal S2) supplied from the battery sensor 55, the acceleration
sensor 56, and so on of the internal sensor section 57.
[0032] The main control section 40 then decides the next action based on this
judgment result, the control program stored beforehand in the internal memory
40A, and the various control parameters stored in the external memory 58 loaded
at that time, and sends control commands based on the decision to the
corresponding sub control sections 43A-43D. As a result, the corresponding
actuators A1-A14 are driven under the control of those sub control sections
43A-43D, and the robot 1 exhibits actions such as swinging the head unit 3 up and
down or left and right, raising the arm units 4A and 4B, or walking.

[0033] At this time, as needed, the main control section 40 also gives a
predetermined sound signal S3 to the loudspeaker 54 so that sound based on the
sound signal S3 is output externally, or outputs a drive signal to the LEDs
provided at predetermined locations of the head unit 3, which function as the
apparent "eyes", so that they blink.

[0034] In this way, the robot 1 can act autonomously based on its surroundings
and internal state, the presence or absence of commands and physical contact from
a user, and so on.

[0035] (2) Processing of the main control section 40 relating to the name
learning function
Next, the name learning function carried in this robot 1 is explained.
[0036] This robot 1 is equipped with a name learning function. Through dialogue
with a person, the robot acquires that person's name and stores it in association
with data on the acoustic features of the person's voice and the morphological
features of the person's face, which are detected from the outputs of the
microphone 51 and the CCD camera 50. Based on the stored data, it recognizes the
appearance of a new person whose name has not yet been acquired, and acquires and
stores that new person's name, acoustic voice features, and facial morphological
features in the same way. The robot thus learns people's names in association
with the people themselves (hereafter called name learning). In the following, a
person whose name has been stored in association with the acoustic features of
the voice and the morphological features of the face is called a "known person",
and a person for whom this has not yet been done is called a "new person".

[0037] This name learning function is realized by various kinds of processing in
the main control section 40.

[0038] Classified by function, the processing of the main control section 40
relating to this name learning function can be divided, as shown in drawing 6,
into: a speech recognition section 60, which recognizes the words a person has
uttered; a speaker-recognition section 61, which detects the acoustic features of
a person's voice and identifies the person based on the detected acoustic
features; a face recognition section 62, which detects the morphological features
of a person's face and identifies the person based on the detected morphological
features; a dialogue control section 63, which manages the various kinds of
control for learning a new person's name, including control of the dialogue with
people, as well as storage management of each known person's name, acoustic voice
features, and facial morphological features; and a speech synthesis section 64,
which generates the sound signal S3 for the various dialogues under the control
of the dialogue control section 63 and sends it to the loudspeaker 54
(drawing 5).

[0039] The speech recognition section 60 has the function of recognizing, word by
word, the language contained in sound signal S1B by performing predetermined
speech recognition processing on sound signal S1B from the microphone 51
(drawing 5), and sends the recognized words to the dialogue control section 63 as
character-string data D1.
[0040] The speaker-recognition section 61 has the function of detecting the
acoustic features of a person's voice contained in sound signal S1B given from
the microphone 51, by predetermined signal processing using the approach
described in, for example, "Segregation of Speakers for Recognition and Speaker
Identification (CH2977-7/91/0000-0873 $1.00 1991 IEEE)".
[0041] During normal operation, the speaker-recognition section 61 sequentially
compares the data of the detected acoustic features with the stored
acoustic-feature data of all known persons. When the detected acoustic features
match those of some known person, it notifies the dialogue control section 63 of
the unique identifier (hereafter called SID) associated with that known person's
acoustic features. When the detected acoustic features match none of the known
persons' acoustic features, it notifies the dialogue control section 63 of an SID
(= -1) meaning that recognition is impossible.
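
The matching behaviour of paragraph [0041] can be sketched as follows. This is a
minimal illustration, not the method of the cited reference: the feature vectors,
the Euclidean distance, the threshold value, and the names `match_speaker`,
`known_voices`, and `threshold` are all assumptions introduced here. Only the
convention that SID = -1 means recognition impossible comes from the text.

```python
import math

UNRECOGNIZED = -1  # SID value meaning "recognition impossible" (from the text)


def match_speaker(features, known, threshold=1.0):
    """Return the SID of the closest stored voice, or -1 if none is close enough.

    `known` maps SID -> stored feature vector. The distance measure and the
    threshold are illustrative assumptions, not taken from the source.
    """
    best_sid, best_dist = UNRECOGNIZED, threshold
    for sid, stored in known.items():  # sequential comparison with every known person
        dist = math.dist(features, stored)
        if dist < best_dist:
            best_sid, best_dist = sid, dist
    return best_sid


# Hypothetical stored data for two known persons.
known_voices = {0: [0.0, 0.0], 1: [5.0, 5.0]}
print(match_speaker([4.8, 5.1], known_voices))    # close to the voice stored under SID 1
print(match_speaker([20.0, 20.0], known_voices))  # far from every stored voice
```

The same pattern would apply to the face recognition section 62 and its FID,
described in paragraph [0045].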



[0042] When the dialogue control section 63 judges that the person is a new
person, the speaker-recognition section 61 detects the acoustic features of that
person's voice during the period between the start instruction and the end
instruction for new learning given from the dialogue control section 63, stores
the data of the detected acoustic features in association with a new unique SID,
and notifies the dialogue control section 63 of this SID.
[0043] In addition, in accordance with start and end instructions for additional
learning or correction learning from the dialogue control section 63, the
speaker-recognition section 61 can perform additional learning, which collects
further data on the acoustic features of the person's voice, and correction
learning, which corrects the data on the acoustic features of the person's voice
so that the person can be recognized correctly.

[0044] The face recognition section 62 continuously monitors picture signal S1A
given from the CCD camera 50 (drawing 5), and has the function of detecting, by
predetermined signal processing, the morphological features of a person's face
contained in the image based on picture signal S1A.
[0045] During normal operation, the face recognition section 62 sequentially
compares the data of the detected morphological features with the stored
morphological-feature data of the faces of all known persons. When the detected
morphological features match those of some known person's face, it notifies the
dialogue control section 63 of the unique identifier (hereafter called FID)
associated with that known person's morphological features. When the detected
morphological features match none of the known persons' faces, it notifies the
dialogue control section 63 of an FID (= -1) meaning that recognition is
impossible.

[0046] When the dialogue control section 63 judges that the person is a new
person, the face recognition section 62 detects the morphological features of the
person's face contained in the image based on picture signal S1A from the CCD
camera 50 during the period between the learning start instruction and the
learning end instruction given from the dialogue control section 63, stores the
data of the detected morphological features in association with a new unique FID,
and notifies the dialogue control section 63 of this FID.
[0047] In addition, in accordance with start and end instructions for additional
learning or correction learning from the dialogue control section 63, the face
recognition section 62 can perform additional learning, which collects further
data on the morphological features of the person's face, and correction learning,
which corrects the data on the morphological features of the person's face so
that the person can be recognized correctly.
[0048] The speech synthesis section 64 has the function of converting the
character-string data D2 given from the dialogue control section 63 into a sound
signal S3, and sends the resulting sound signal S3 to the loudspeaker 54
(drawing 5). The loudspeaker 54 can thereby output speech based on this sound
signal S3.

[0049] As shown in drawing 7, the dialogue control section 63 has a memory 65
(drawing 6) that stores, in association with one another, each known person's
name, the SID associated with the acoustic-feature data of that person's voice
stored in the speaker-recognition section 61, and the FID associated with the
morphological-feature data of that person's face stored in the face recognition
section 62.
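
One way to picture the association held in the memory 65 is a small table with
one record per known person. The sketch below is an assumption made for
illustration: the class and method names (`Association`, `AssociationMemory`,
`register`, `name_for_sid`, `name_for_fid`, `lookup`) and the sample name are
hypothetical, and the source does not say how the memory is actually organized.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Association:
    name: str  # the known person's name
    sid: int   # identifier issued by the speaker-recognition section
    fid: int   # identifier issued by the face recognition section


class AssociationMemory:
    """Hypothetical stand-in for memory 65: one record per known person."""

    def __init__(self) -> None:
        self._records: List[Association] = []

    def register(self, name: str, sid: int, fid: int) -> None:
        self._records.append(Association(name, sid, fid))

    def name_for_sid(self, sid: int) -> Optional[str]:
        # -1 (recognition impossible) is never registered, so it yields None.
        return next((r.name for r in self._records if r.sid == sid), None)

    def name_for_fid(self, fid: int) -> Optional[str]:
        return next((r.name for r in self._records if r.fid == fid), None)

    def lookup(self, name: str) -> Optional[Association]:
        return next((r for r in self._records if r.name == name), None)


memory = AssociationMemory()
memory.register("Alice", sid=3, fid=7)  # illustrative values only
print(memory.name_for_fid(7))
print(memory.name_for_sid(-1))
```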

[0050] By giving predetermined character-string data D2 to the speech synthesis
section 64 at predetermined timing, the dialogue control section 63 causes the
loudspeaker 54 to output speech for talking to the other person, asking his or
her name, confirming the name, and so on. It then judges whether the person is a
new person based on the recognition results of the speech recognition section 60
and the speaker-recognition section 61 for the person's responses at that time,
the recognition result of the face recognition section 62 for that person, and
the above-mentioned information stored in the memory 65 that associates known
persons' names with their SIDs and FIDs.

[0051] When it judges that the person is a new person, the dialogue control
section 63 gives start and end instructions for new learning to the
speaker-recognition section 61 and the face recognition section 62, causing them
to collect and store data on the acoustic features of the new person's voice and
the morphological features of the new person's face. As a result, the
speaker-recognition section 61 and the face recognition section 62 each return
the SID and the FID associated with that data, and the dialogue control section
63 stores these in the memory 65 in association with the person's name acquired
through the dialogue.

[0052] When it judges that the person is a known person, on the other hand, the
dialogue control section 63 gives start instructions for additional learning or
correction learning to the speaker-recognition section 61 and the face
recognition section 62 as needed, causing them to perform additional learning or
correction learning. Along with this, it sequentially sends predetermined
character-string data D2 to the speech synthesis section 64 at predetermined
timing, thereby performing dialogue control that prolongs the dialogue with the
person until the speaker-recognition section 61 and the face recognition section
62 have collected the considerable amount of data they need for additional
learning or correction learning.

[0053] (3) Concrete processing of the dialogue control section 63 relating to the
name learning function
Next, the concrete processing of the dialogue control section 63 relating to the
name learning function is explained.
[0054] Based on the control program stored in the external memory 58 (drawing 5),
the dialogue control section 63 performs the various kinds of processing for
learning new persons' names one after another, in accordance with the name
learning procedure RT1 shown in drawing 8 and drawing 9.

[0055] That is, when the face recognition section 62 recognizes a person's face
based on picture signal S1A from the CCD camera 50 and gives an FID to the
dialogue control section 63, the dialogue control section 63 starts the name
learning procedure RT1 at step SP0. In the following step SP1, it judges, based
on the information stored in the memory 65 that associates each known person's
name with the corresponding SID and FID (hereafter called association
information), whether the corresponding name can be retrieved from that FID (that
is, whether the FID is something other than "-1", which means recognition
impossible).
[0056] Obtaining an affirmative result in step SP1 means that the person is a
known person: the face recognition section 62 has stored the morphological-feature
data of that person's face, and the FID associated with that data is stored in
the memory 65 in association with the person's name. Even in this case, however,
the face recognition section 62 may have misrecognized a new person as a known
person.
[0057] When an affirmative result is obtained in step SP1, the dialogue control
section 63 therefore proceeds to step SP2 and sends predetermined
character-string data D2 to the speech synthesis section 64, causing the
loudspeaker 54 to output a question such as "Are you Mr. OO?", as shown in
drawing 10, to confirm whether the person's name agrees with the name retrieved
from the FID (the name corresponding to OO above).

[0058] The dialogue control section 63 then proceeds to step SP3 and waits for
the speech recognition section 60 to give the speech recognition result of the
person's response to this question, such as "Yes, that's right" or "No, that's
wrong". When this speech recognition result is given from the speech recognition
section 60, and the SID that is the speaker recognition result at that time is
given from the speaker-recognition section 61, the dialogue control section 63
proceeds to step SP4 and judges, based on the speech recognition result from the
speech recognition section 60, whether the person's response is affirmative.

[0059] Obtaining an affirmative result in step SP4 means that the name retrieved
in step SP1 from the FID given by the face recognition section 62 agrees with the
person's name, and therefore that the person can almost certainly be concluded to
be the very person who has the name retrieved by the dialogue control section 63.
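
Steps SP1 to SP4 amount to a lookup followed by a yes/no confirmation. The
following is a rough sketch under stated assumptions: `confirm_known_person`,
`name_by_fid`, and `ask_yes_no` are hypothetical names, and the spoken dialogue
and speech recognition are reduced to a callback that returns True or False.

```python
def confirm_known_person(fid, name_by_fid, ask_yes_no):
    """Return the confirmed name, or None if the name must be asked (step SP8)."""
    name = name_by_fid.get(fid)  # SP1: an FID of -1 or an unknown FID finds no name
    if name is None:
        return None
    if ask_yes_no(f"Are you {name}?"):  # SP2 to SP4: ask, wait, judge the response
        return name
    return None  # the face was misrecognized, or the stored name is wrong


# Hypothetical association information: FID 7 belongs to "Alice".
name_by_fid = {7: "Alice"}
print(confirm_known_person(7, name_by_fid, lambda question: True))
print(confirm_known_person(-1, name_by_fid, lambda question: True))
```
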
[0060] At this time, the dialogue control section 63 therefore concludes that the
person is the very person who has the name it retrieved, proceeds to step SP5,
and gives a start instruction for additional learning to the face recognition
section 62. Along with this, the dialogue control section 63 gives the
speaker-recognition section 61 a start instruction for additional learning when
the SID first given from the speaker-recognition section 61 agrees with the SID
that can be retrieved from this name based on the association information stored
in the memory 65, and gives it a start instruction for correction learning when
the two do not agree.

[0061] The dialogue control section 63 then proceeds to step SP6 and sequentially
sends to the speech synthesis section 64 character-string data D2 for small talk
that prolongs the dialogue with the person, such as "It's nice weather today,
isn't it?", as shown for example in drawing 10. Once a predetermined time
sufficient for additional learning or correction learning has passed, it proceeds
to step SP7, gives end instructions for additional learning or correction
learning to the speaker-recognition section 61 and the face recognition section
62, and then proceeds to step SP20 and ends the name learning processing for that
person.

[0062] Obtaining a negative result in step SP1, on the other hand, means either
that the person whose face was recognized by the face recognition section 62 is a
new person, or that the face recognition section 62 has misrecognized a known
person as a new person. Obtaining a negative result in step SP4 means that the
name retrieved from the FID first given by the face recognition section 62 does
not agree with the person's name. In either case, the dialogue control section 63
can be said not to have identified the person correctly.
[0063] When a negative result is obtained in step SP1 or in step SP4, the
dialogue control section 63 therefore proceeds to step SP8 and gives
character-string data D2 to the speech synthesis section 64, causing the
loudspeaker 54 to output a question for finding out the person's name, such as
"Um, please tell me your name", as shown for example in drawing 11.

[0064] The dialogue control section 63 then proceeds to step SP9 and waits for
the speech recognition result (that is, the name) of the person's response to
this question, such as "I am OO", and the speaker recognition result (that is,
the SID) at the time of that response to be given from the speech recognition
section 60 and the speaker-recognition section 61, respectively.

[0065] When the speech recognition result is given from the speech recognition
section 60 and the SID is given from the speaker-recognition section 61, the
dialogue control section 63 proceeds to step SP10 and judges whether the person
is a new person based on this speech recognition result, this SID, and the FID
first given from the face recognition section 62.
[0066] In this embodiment, this judgment is made by a majority decision among
three recognition results: the name obtained by speech recognition in the speech
recognition section 60, the SID from the speaker-recognition section 61, and the
FID from the face recognition section 62.
[0067] For example, the person is judged to be a new person when the SID from the
speaker-recognition section 61 and the FID from the face recognition section 62
are both "-1", meaning recognition impossible, and the person's name obtained in
step SP9 from the speech recognition result of the speech recognition section 60
is not associated with any SID or FID in the memory 65. Such a judgment is
possible because the situation is one in which a person who resembles no known
face and no known voice has a completely new name.

[0068] The dialogue control section 63 also judges that the person is a new
person when the SID from the speaker-recognition section 61 and the FID from the
face recognition section 62 are associated with different names in the memory 65,
or one of them is "-1", meaning recognition impossible, and the person's name
obtained in step SP9 from the speech recognition result of the speech recognition
section 60 is not stored in the memory 65. This is because misrecognizing a new
category as one of the known categories happens easily in recognition processing
in general, so that, given that the name obtained by speech recognition is not
registered, the person can be judged to be a new person with fairly high
confidence.
[0069] On the other hand, the dialogue control section 63 judges that the person
is a known person when the SID from the speaker-recognition section 61 and the
FID from the face recognition section 62 are associated with the same name in the
memory 65, and the person's name obtained in step SP9 from the speech recognition
result of the speech recognition section 60 is the name with which that SID and
FID are associated.

[0070] The dialogue control section 63 likewise judges that the person is a known
person when the SID from the speaker-recognition section 61 and the FID from the
face recognition section 62 are associated with different names in the memory 65,
and the person's name obtained in step SP9 from the speech recognition result of
the speech recognition section 60 is the name with which one of that SID and FID
is associated. In this case the recognition result of either the
speaker-recognition section 61 or the face recognition section 62 is considered
to be wrong, so the judgment is made by majority.
[0071] On the other hand, when the SID from the speaker-recognition section 61
and the FID from the face recognition section 62 are associated with different
names in the memory 65, and the person's name obtained in step SP9 from the
speech recognition result of the speech recognition section 60 is a name in the
memory 65 that is associated with neither that SID nor that FID, the dialogue
control section 63 does not judge whether the person is a known person or a new
person. In this case it is conceivable that one of, or even all of, the speech
recognition section 60, the speaker-recognition section 61, and the face
recognition section 62 are wrong, but this cannot be determined at this stage.
The judgment is therefore suspended in this case.
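
The cases of paragraphs [0067] to [0071] can be collected into a single decision
function. This is a sketch under stated assumptions rather than the patented
procedure itself: the function and parameter names are hypothetical, `sid_name`
and `fid_name` stand for the names that the memory 65 associates with the SID and
the FID (None when the value is -1), and any combination the text does not cover
is treated here as "undecided".

```python
def judge_person(spoken_name, sid_name, fid_name, stored_names):
    """Return "new", "known", or "undecided" for the step SP10 judgment."""
    if spoken_name not in stored_names:
        # [0067]: both recognizers failed. [0068]: they disagree, or one failed.
        if sid_name is None or fid_name is None or sid_name != fid_name:
            return "new"
        # Assumption: voice and face agree on a person but the name is unknown.
        return "undecided"
    if sid_name is not None and fid_name is not None:
        if sid_name == fid_name == spoken_name:
            return "known"  # [0069]: all three results agree
        if sid_name != fid_name and spoken_name in (sid_name, fid_name):
            return "known"  # [0070]: two of the three results agree
    return "undecided"  # [0071], plus cases the text does not spell out


stored = {"Alice", "Bob", "Carol"}  # hypothetical names held in memory 65
print(judge_person("Dave", None, None, stored))
print(judge_person("Bob", "Alice", "Bob", stored))
print(judge_person("Carol", "Alice", "Bob", stored))
```
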
[0072] When the dialogue control section 63 judges in step SP10, through such
judgment processing, that the person is a new person, it proceeds to step SP11
and gives start instructions for new learning to the speaker-recognition section
61 and the face recognition section 62. It then proceeds to step SP12 and sends
to the speech synthesis section 64 character-string data D2 for small talk that
prolongs the dialogue with the person, such as "I am a robot. Nice to meet you."
or "Mr. OO, it's a nice day today, isn't it?", as shown for example in
drawing 11.
[0073] The dialogue control section 63 then proceeds to step SP13 and judges
whether the collection of acoustic-feature data in the speaker-recognition
section 61 and the collection of facial morphological-feature data in the face
recognition section 62 have both reached a sufficient amount. If a negative
result is obtained, it returns to step SP12, and thereafter repeats the loop of
steps SP12-SP13-SP12 until an affirmative result is obtained in step SP13.

[0074] When the collection of acoustic-feature data in the speaker-recognition
section 61 and the collection of facial morphological-feature data in the face
recognition section 62 have both reached a sufficient amount, so that an
affirmative result is obtained in step SP13, the dialogue control section 63
proceeds to step SP14 and gives end instructions for new learning to the
speaker-recognition section 61 and the face recognition section 62. As a result,
the speaker-recognition section 61 stores the acoustic-feature data in
association with a new SID, and the face recognition section 62 stores the
morphological-feature data in association with a new FID.
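
The loop of steps SP11 to SP14 can be sketched as below. Everything here is an
illustrative assumption: `StubRecognizer` is a hypothetical stand-in that counts
utterances instead of extracting real features, and `learn_new_person`, `say`,
and `small_talk` are names introduced only for this example.

```python
from itertools import cycle


class StubRecognizer:
    """Hypothetical recognizer that needs a fixed number of observations."""

    def __init__(self, needed, new_id):
        self.needed, self.new_id, self.collected = needed, new_id, 0
        self.learning = False

    def start_new_learning(self):
        self.learning = True

    def observe(self):
        if self.learning:
            self.collected += 1

    def has_enough_data(self):
        return self.collected >= self.needed

    def finish_new_learning(self):
        self.learning = False
        return self.new_id  # the newly assigned SID or FID


def learn_new_person(speaker, face, say, small_talk):
    speaker.start_new_learning()  # SP11: start instructions for new learning
    face.start_new_learning()
    for line in cycle(small_talk):  # SP12: keep the dialogue going
        say(line)
        speaker.observe()
        face.observe()
        if speaker.has_enough_data() and face.has_enough_data():  # SP13
            break
    # SP14: end instructions; each recognizer reports its new identifier
    return speaker.finish_new_learning(), face.finish_new_learning()


spoken = []
sid, fid = learn_new_person(StubRecognizer(2, 5), StubRecognizer(3, 9),
                            spoken.append, ["It's a nice day, isn't it?"])
print(sid, fid, len(spoken))
```

Cycling through the small-talk lines is a design choice of this sketch: it keeps
the dialogue going until both recognizers report enough data, mirroring the
SP12-SP13-SP12 loop, rather than stopping after a fixed number of lines.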

[0075] The dialogue control section 63 then proceeds to step SP15 and waits for
this SID and FID to be given from the speaker-recognition section 61 and the face
recognition section 62, respectively. When they are given, it associates them
with the person's name obtained in step SP9 from the speech recognition result of
the speech recognition section 60 and registers them in the memory 65, as shown
for example in drawing 12. The dialogue control section 63 then proceeds to step
SP20 and ends the name learning processing for that person.

[0076] When it is judged in step SP10 that the person is a known person, on the
other hand, the dialogue control section 63 proceeds to step SP16. If the
speaker-recognition section 61 or the face recognition section 62 has recognized
the known person correctly (that is, it has output as its recognition result the
same SID or FID as the one stored in the memory 65 as association information for
that known person), the dialogue control section 63 gives that section a start
instruction for additional learning. If the speaker-recognition section 61 or the
face recognition section 62 has not recognized the known person correctly (that
is, it has output as its recognition result an SID or FID different from the one
stored in the memory 65 as association information for that known person), the
dialogue control section 63 gives that section a start instruction for correction
learning.

[0077] Specifically, when the SID obtained from the speaker-recognition section
61 in step SP9 and the FID first given from the face recognition section 62 are
associated with the same name in the memory 65, and the name obtained in step SP9
from the speech recognition result of the speech recognition section 60 is the
name with which that SID and FID are associated, so that the person has been
judged to be a known person in step SP10, the dialogue control section 63 gives
start instructions for additional learning to both the speaker-recognition
section 61 and the face recognition section 62.
[0078] When the SID obtained from the speaker-recognition section 61 in step SP9
and the FID first given from the face recognition section 62 are associated with
different names in the memory 65, and the name obtained in step SP9 from the
speech recognition result of the speech recognition section 60 is the name with
which one of that SID and FID is associated, so that the person has been judged
to be a known person in step SP10, the dialogue control section 63 gives a start
instruction for additional learning to whichever of the speaker-recognition
section 61 and the face recognition section 62 output the SID or FID associated
with the name obtained from the speech recognition result. It gives a start
instruction for correction learning to the other of the two sections, which
output the FID or SID that is not associated with the name obtained from the
speech recognition result.
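
The choice made in step SP16 can be written compactly. The sketch below is an
assumption-laden illustration: `learning_modes` and its parameters are
hypothetical names, and `sid_name` and `fid_name` again stand for the names that
the memory 65 associates with the SID and FID output by the two recognizers.

```python
def learning_modes(spoken_name, sid_name, fid_name):
    """Return (speaker_mode, face_mode) for a person judged to be known.

    A recognizer whose output agrees with the spoken name is given additional
    learning; one whose output disagrees is given correction learning.
    """
    speaker_mode = "additional" if sid_name == spoken_name else "correction"
    face_mode = "additional" if fid_name == spoken_name else "correction"
    return speaker_mode, face_mode


# [0077]: both recognizers agree with the spoken name.
print(learning_modes("Alice", "Alice", "Alice"))
# [0078]: only the face recognizer agrees with the spoken name.
print(learning_modes("Bob", "Alice", "Bob"))
```
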
[0079] The dialogue control section 63 then proceeds to step SP17 and
sequentially sends to the speech synthesis section 64 character-string data D2
for small talk that prolongs the dialogue with the person, such as "Oh, you are
Mr. OO. I remember now. It's a nice day today, isn't it?" or "When was it that we
last met?", as shown for example in drawing 13. Once a predetermined time
sufficient for additional learning or correction learning has passed, it proceeds
to step SP18, gives end instructions for additional learning or correction
learning to the speaker-recognition section 61 and the face recognition section
62, and then proceeds to step SP20 and ends the name learning processing for that
person.



[0080] When it is judged in step SP10 that it cannot be determined whether the
person is a known person or a new person, on the other hand, the dialogue control
section 63 proceeds to step SP19 and sequentially sends to the speech synthesis
section 64 character-string data D2 for small talk such as "Oh, is that so? How
are you?", as shown for example in drawing 14.

[0081] In this case, the dialogue control section 63 gives the speaker recognition section 61 and the face recognition section 62 neither a start instruction nor an end instruction for new learning, additional learning, or correction learning (that is, neither section is made to perform new learning, additional learning, or correction learning). When a predetermined time has passed, it proceeds to step SP 20 and ends the identifier learning processing for that person.
[0082] Thus, by performing dialogue control with a person and operation control of the speaker recognition section 61 and the face recognition section 62 based on the recognition results of the speech recognition section 60, the speaker recognition section 61, and the face recognition section 62, the dialogue control section 63 can learn the identifiers of new persons one after another.
[0083] (4) Concrete configurations of the speech recognition section 60 and the face recognition section 62: next, the concrete configurations of the speech recognition section 60 and the face recognition section 62 for realizing the identifier learning function described above are explained.
[0084] (4-1) Concrete configuration of the speech recognition section 60: drawing 15 shows the concrete configuration of this speech recognition section 60.
[0085] In this speech recognition section 60, the sound signal S1B from the microphone 51 is input to the AD (analog-to-digital) conversion section 70. The AD conversion section 70 samples and quantizes the supplied analog sound signal S1B, converting it into voice data, which is a digital signal. This voice data is supplied to the feature-extraction section 71.
[0086] The feature-extraction section 71 performs, for example, MFCC (Mel Frequency Cepstrum Coefficient) analysis on the input voice data for each suitable frame, and outputs the MFCCs obtained by the analysis to the matching section 72 and the non-registered word section processing section 76 as a feature vector (feature parameter). The feature-extraction section 71 may instead extract, for example, linear prediction coefficients, cepstrum coefficients, line spectrum pairs, or the power in each predetermined frequency band (the output of a filter bank) as the feature vector.

[0087] Using the feature vectors from the feature-extraction section 71, the matching section 72 recognizes the voice input to the microphone 51 (the input voice) based on, for example, the continuous-density HMM (Hidden Markov Model) method, referring as needed to the acoustic model storage section 73, the dictionary storage section 74, and the syntax storage section 75.
[0088] That is, the acoustic model storage section 73 stores acoustic models representing the acoustic features of subword units, such as the individual phonemes and syllables, of the language of the speech to be recognized (besides HMMs, these may include, for example, the standard patterns used for DP (Dynamic Programming) matching). Here, since speech recognition is performed based on the continuous-density HMM method, HMMs are used as the acoustic models.
[0089] The dictionary storage section 74 stores a word dictionary in which, for each word to be recognized, information about the pronunciation of the word (phoneme information), clustered word by word, is associated with the headword of the word.
[0090] Here, drawing 16 shows the word dictionary stored in the dictionary storage section 74.

[0091] As shown in drawing 16, the word dictionary associates the headword of each word with its phoneme sequence, and the phoneme sequences are clustered by corresponding word. In the word dictionary of drawing 16, one entry (one line of drawing 16) corresponds to one cluster.
[0092] In drawing 16, the headwords are written in the Roman alphabet and in Japanese (kana and kanji), and the phoneme sequences are written in the Roman alphabet. "N" in a phoneme sequence represents the Japanese syllabic nasal "ん". Although drawing 16 lists one phoneme sequence per entry, two or more phoneme sequences may also be listed in one entry.
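
As an illustration only, the word dictionary described above can be sketched as a mapping from headword to one or more phoneme sequences. The class name `DictionaryEntry`, the helper `add_pronunciation`, and the sample entries are hypothetical; the actual contents of drawing 16 are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryEntry:
    """One line of the word dictionary: a headword plus its phoneme sequences."""
    headword: str
    phoneme_sequences: list = field(default_factory=list)


# Hypothetical sample entries (not taken from drawing 16).
word_dictionary = {
    "kono": DictionaryEntry("kono", ["k o n o"]),
    "iro": DictionaryEntry("iro", ["i r o"]),
}


def add_pronunciation(dictionary, headword, phonemes):
    """Attach a phoneme sequence to an entry, creating the entry if needed."""
    entry = dictionary.setdefault(headword, DictionaryEntry(headword))
    if phonemes not in entry.phoneme_sequences:
        entry.phoneme_sequences.append(phonemes)
    return entry
```

Allowing a list of phoneme sequences per entry mirrors the remark that one entry may hold two or more phoneme sequences.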

[0093] Returning to drawing 15, the syntax storage section 75 stores syntax rules that describe how the words registered in the word dictionary of the dictionary storage section 74 are chained (connected).
[0094] Here, drawing 17 shows the syntax rules stored in the syntax storage section 75. The syntax rules of drawing 17 are written in EBNF (Extended Backus-Naur Form).

[0095] In drawing 17, the text from the beginning of a line up to the first ";" represents one syntax rule. A string of letters prefixed with "$" represents a variable, and a string of letters without "$" represents the headword of a word (the Roman-alphabet headword shown in drawing 16). Furthermore, a part enclosed in [] may be omitted, and "|" means that one of the headwords (or variables) placed before and after it is selected.

[0096] Therefore, in drawing 17, for example, the syntax rule "$col=[kono|sono] iro wa;" on the first line (the first line from the top) indicates that the variable $col is the word sequence "kono iro (color) wa" or "sono iro (color) wa".
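
To make the notation concrete, the following sketch expands a rule body written with the [] and "|" conventions described above. The function `expand_rule` is hypothetical and handles only flat groups and plain words (no nesting, no variables). Because [] marks an omittable part, the sketch also yields the bare "iro wa"; whether the grammar of drawing 17 intends that reading is not stated in this section.

```python
import itertools
import re


def expand_rule(body):
    """Expand a rule body such as '[kono|sono] iro wa' into word sequences.

    Supports only flat '[a|b]' optional groups and plain words; this is a
    simplification of EBNF made for illustration.
    """
    choices = []
    for token in re.findall(r"\[[^\]]*\]|\S+", body):
        if token.startswith("["):
            alternatives = [alt.strip() for alt in token[1:-1].split("|")]
            choices.append(alternatives + [""])  # "" models omitting the group
        else:
            choices.append([token])
    return [" ".join(word for word in combo if word)
            for combo in itertools.product(*choices)]
```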

[0097] In the syntax rules shown in drawing 17, the variables $sil and $garbage are not defined. The variable $sil represents a silence acoustic model (silence model), and the variable $garbage basically represents a garbage model that permits free transitions between phonemes.
[0098] Returning again to drawing 15, the matching section 72 constructs the acoustic model of a word (word model) by connecting the acoustic models stored in the acoustic model storage section 73 with reference to the word dictionary of the dictionary storage section 74. Furthermore, the matching section 72 connects several word models with reference to the syntax rules stored in the syntax storage section 75 and, using the word models connected in this way, recognizes the voice input to the microphone 51 by the continuous-density HMM method based on the feature vectors. That is, the matching section 72 detects the sequence of word models with the highest score (likelihood) of observing the time series of feature vectors output by the feature-extraction section 71, and outputs the headwords of the word sequence corresponding to that sequence of word models as the speech recognition result.

[0099] More specifically, the matching section 72 connects the word models corresponding to the words and, using the word models connected in this way, recognizes the voice input to the microphone 51 by the continuous-density HMM method based on the feature vectors. That is, the matching section 72 detects the sequence of word models with the highest score (likelihood) of observing the time series of feature vectors output by the feature-extraction section 71, and outputs the headwords of the word sequence corresponding to that sequence of word models as the speech recognition result.
[0100] More specifically, for the word sequence corresponding to the connected word models, the matching section 72 accumulates the appearance probabilities (output probabilities) of the individual feature vectors, takes the accumulated value as the score, and outputs the headwords of the word sequence with the highest score as the speech recognition result.
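
The score accumulation just described can be illustrated with a toy sketch. It assumes that the per-frame output probabilities for each candidate word sequence are already available and accumulates them in the log domain (an implementation choice made here, not something the text specifies). An actual HMM recognizer would obtain these values during decoding; the function names are hypothetical.

```python
import math


def accumulate_score(output_probabilities):
    """Accumulate per-frame output probabilities as a sum of logarithms."""
    return sum(math.log(p) for p in output_probabilities)


def best_word_sequence(candidates):
    """Return the headword sequence whose accumulated score is highest.

    `candidates` maps a headword sequence (string) to the list of output
    probabilities of the feature vectors under that sequence's word models.
    """
    return max(candidates, key=lambda words: accumulate_score(candidates[words]))
```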

[0101] The recognition result for the voice input to the microphone 51, output as described above, is sent to the dialogue control section 63 as character-string data D1.

[0102] Here, in the embodiment of drawing 17, the 9th line (the 9th line from the top) contains a syntax rule "$pat1=$color1 $garbage $color2;" that uses the variable $garbage representing the garbage model (hereinafter called the rule for non-registered words where appropriate). When this rule for non-registered words is applied, the matching section 72 detects the voice section corresponding to the variable $garbage as the voice section of a non-registered word. Furthermore, the matching section 72 detects, as the phoneme sequence of the non-registered word, the phoneme sequence given by the phoneme transitions in the garbage model that the variable $garbage represents when the rule for non-registered words is applied. The matching section 72 then supplies the voice section and the phoneme sequence of the non-registered word, detected when a speech recognition result to which the rule for non-registered words was applied is obtained, to the non-registered word section processing section 76.

[0103] According to the above rule for non-registered words "$pat1=$color1 $garbage $color2;", one non-registered word is detected between the phoneme sequence of a word (sequence) registered in the word dictionary and represented by the variable $color1 and the phoneme sequence of a word (sequence) registered in the word dictionary and represented by the variable $color2. However, this embodiment is also applicable when an utterance contains two or more non-registered words, and when a non-registered word is not sandwiched between words (sequences) registered in the word dictionary.
[0104] The non-registered word section processing section 76 temporarily stores the sequence of feature vectors (feature-vector sequence) supplied from the feature-extraction section 71. Furthermore, when the non-registered word section processing section 76 receives the voice section and the phoneme sequence of a non-registered word from the matching section 72, it detects the feature-vector sequence of the voice in that voice section from the temporarily stored feature-vector sequence. The non-registered word section processing section 76 then assigns a unique ID (identification) to the phoneme sequence (non-registered word) from the matching section 72, and supplies it to the feature-vector buffer 77 together with the phoneme sequence of the non-registered word and the feature-vector sequence of the voice section.

[0105] The feature-vector buffer 77 temporarily stores the ID, phoneme sequence, and feature-vector sequence of the non-registered word supplied from the non-registered word section processing section 76 in association with one another, as shown in drawing 18.

[0106] In drawing 18, sequential numbers starting from 1 are assigned as IDs to the non-registered words. Accordingly, for example, when the IDs, phoneme sequences, and feature-vector sequences of N non-registered words are currently stored in the feature-vector buffer 77 and the matching section 72 detects the voice section and the phoneme sequence of a non-registered word, the non-registered word section processing section 76 assigns N+1 as the ID of that non-registered word, and the ID, phoneme sequence, and feature-vector sequence of that non-registered word are stored in the feature-vector buffer 77 as shown by the dotted line in drawing 18.
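
A minimal sketch of such a buffer follows, assuming an in-memory dictionary keyed by ID. The class name `FeatureVectorBuffer` and its method are hypothetical; only the sequential numbering from 1 and the three stored items come from the description above.

```python
class FeatureVectorBuffer:
    """Stores ID, phoneme sequence, and feature-vector sequence together."""

    def __init__(self):
        self.entries = {}  # ID -> (phoneme_sequence, feature_vectors)

    def store(self, phoneme_sequence, feature_vectors):
        """Store a non-registered word and return its new ID (N + 1)."""
        new_id = len(self.entries) + 1  # IDs run sequentially from 1
        self.entries[new_id] = (phoneme_sequence, list(feature_vectors))
        return new_id
```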

[0107] Returning again to drawing 15, the clustering section 78 calculates, for the non-registered word newly stored in the feature-vector buffer 77 (hereinafter called the new non-registered word where appropriate), a score against each of the other non-registered words already stored in the feature-vector buffer 77 (hereinafter called the stored non-registered words where appropriate).

[0108] That is, the clustering section 78 treats the new non-registered word as input voice, treats the stored non-registered words as words registered in the word dictionary, and calculates the score of the new non-registered word against each stored non-registered word in the same way as the matching section 72. Specifically, the clustering section 78 recognizes the feature-vector sequence of the new non-registered word by referring to the feature-vector buffer 77, connects acoustic models according to the phoneme sequence of a stored non-registered word, and calculates the score as the likelihood with which the feature-vector sequence of the new non-registered word is observed from the connected acoustic models.
[0109] The acoustic models used are those stored in the acoustic model storage section 73.



[0110] Similarly, the clustering section 78 also calculates the score of each stored non-registered word against the new non-registered word, and updates the score sheet stored in the score sheet storage section 79 with these scores.
[0111] Furthermore, by referring to the updated score sheet, the clustering section 78 detects, from among the clusters already obtained by clustering the non-registered words (stored non-registered words), the cluster to which the new non-registered word is to be added as a new member. The clustering section 78 then makes the new non-registered word a new member of the detected cluster, divides that cluster based on its members, and updates the score sheet stored in the score sheet storage section 79 based on the division result.

[0112] The score sheet storage section 79 stores the score sheet, in which the scores of the new non-registered word against the stored non-registered words, the scores of the stored non-registered words against the new non-registered word, and so on are registered.
[0113] Here, drawing 19 shows the score sheet.

[0114] The score sheet consists of entries in which the "ID", "phoneme sequence", "cluster number", "representative member ID", and "score" of a non-registered word are described.

[0115] The "ID" and "phoneme sequence" of a non-registered word, identical to those stored in the feature-vector buffer 77, are registered by the clustering section 78. The "cluster number" is a number identifying the cluster of which the non-registered word of that entry is a member; it is assigned by the clustering section 78 and registered in the score sheet. The "representative member ID" is the ID of the non-registered word that serves as the representative member of the cluster of which the non-registered word of that entry is a member; this representative member ID makes it possible to recognize the representative member of the cluster to which the non-registered word belongs. The representative member of a cluster is obtained by the clustering section 78, and its ID is registered as the representative member ID in the score sheet. The "score" is the score of the non-registered word of that entry against each of the other non-registered words and, as described above, is calculated by the clustering section 78.
[0116] For example, if the IDs, phoneme sequences, and feature-vector sequences of N non-registered words are currently stored in the feature-vector buffer 77, then the IDs, phoneme sequences, cluster numbers, representative member IDs, and scores of those N non-registered words are registered in the score sheet.

[0117] When the ID, phoneme sequence, and feature-vector sequence of a new non-registered word are newly stored in the feature-vector buffer 77, the clustering section 78 updates the score sheet as shown by the dotted lines in drawing 19.

[0118] That is, the ID, phoneme sequence, cluster number, and representative member ID of the new non-registered word, and the scores of the new non-registered word against each stored non-registered word (the scores s(N+1, 1), s(N+1, 2), ..., s(N+1, N) in drawing 19), are added to the score sheet. Furthermore, the scores of each stored non-registered word against the new non-registered word (s(1, N+1), s(2, N+1), ..., s(N, N+1) in drawing 19) are added to the score sheet. In addition, as described later, the cluster numbers and representative member IDs of the non-registered words in the score sheet are changed as needed.

[0119] In the embodiment of drawing 19, the score of the non-registered word (utterance) with ID i against the non-registered word (phoneme sequence) with ID j is denoted s(i, j).
[0120] The score s(i, i) of the non-registered word (utterance) with ID i against the non-registered word (phoneme sequence) with the same ID i is also registered in the score sheet (drawing 19). However, since this score s(i, i) is calculated in the matching section 72 when the phoneme sequence of the non-registered word is detected, it need not be calculated in the clustering section 78.
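
The score sheet update can be sketched as adding one row and one column. The layout below (a dictionary of rows, each with a `scores` mapping) and the callable `score(i, j)`, standing in for the likelihood s(i, j), are assumptions made for illustration. In the description the diagonal score s(i, i) comes from the matching section; here it is simply obtained from the same callable.

```python
def add_to_score_sheet(sheet, new_id, phonemes, score):
    """Add a row for new_id and a column s(i, new_id) to every stored row.

    sheet: dict mapping ID -> row dict with keys
           "phonemes", "cluster", "representative", "scores".
    score: callable score(i, j) returning s(i, j) (hypothetical stand-in).
    """
    stored_ids = list(sheet)
    row = {
        "phonemes": phonemes,
        "cluster": None,          # filled in later by clustering
        "representative": None,   # filled in later by clustering
        "scores": {new_id: score(new_id, new_id)},
    }
    for j in stored_ids:
        row["scores"][j] = score(new_id, j)            # s(N+1, j)
        sheet[j]["scores"][new_id] = score(j, new_id)  # s(j, N+1)
    sheet[new_id] = row
    return sheet
```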



[0121] Returning again to drawing 15, the maintenance section 80 updates the word dictionary stored in the dictionary storage section 74 based on the updated score sheet in the score sheet storage section 79.
[0122] Here, the representative member of a cluster is determined as follows. Among the non-registered words that are members of the cluster, the one that maximizes the total of the scores with respect to each of the other non-registered words (the average obtained by dividing the total by the number of other non-registered words may be used instead) is made the representative member of the cluster. Therefore, when the member ID of a member belonging to the cluster is denoted by k, the representative member is given by the following equation.
[0123]
[Equation 1]

\[ K = \max_{k} \Bigl\{ \sum_{k'} s(k', k) \Bigr\} \tag{1} \]

[0124] The member whose ID is the value K (one of the values of k) given by this equation is made the representative member.

[0125] In equation (1), max_k{} means the k that maximizes the value inside {}. Like k, k' means the ID of a member belonging to the cluster. Furthermore, the summation sign means the total taken as k' varies over the IDs of all the members belonging to the cluster.

[0126] When the representative member is determined as described above and the cluster has one or two non-registered words as members, no score calculation is needed to decide the representative member. That is, when the cluster has a single non-registered word as its member, that non-registered word becomes the representative member, and when the cluster has two non-registered words as members, either of the two may be made the representative member.

[0127] The method of determining the representative member is not limited to the above. For example, among the non-registered words that are members of a cluster, the one that minimizes the total distance in feature-vector space to each of the other non-registered words may be made the representative member of the cluster.
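
The selection of a representative member can be sketched as follows. The function name and the callable `s` (returning the score s(i, j)) are hypothetical. The sketch sums over the other members only, following the prose; whether the self-score s(k, k) is included in the total is an assumption either way.

```python
def representative_member(members, s):
    """Pick the member k that maximizes the total score from the other members.

    members: list of member IDs in one cluster.
    s: callable s(i, j), score of utterance i against phoneme sequence j.
    """
    if len(members) <= 2:
        # With one or two members no score calculation is needed;
        # either member may serve, so the first is returned.
        return members[0]
    return max(
        members,
        key=lambda k: sum(s(kp, k) for kp in members if kp != k),
    )
```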

[0128] In the speech recognition section 60 configured as described above, the speech recognition processing that recognizes the voice input to the microphone 51 and the non-registered word processing for non-registered words are performed according to the speech recognition procedure RT 2 shown in drawing 20.

[0129] In practice, in the speech recognition section 60, when the sound signal S1B obtained when a person speaks is supplied from the microphone 51 through the AD conversion section 70 to the feature-extraction section 71 as voice data, this speech recognition procedure RT 2 starts at step SP 30.
[0130] In the following step SP 31, the feature-extraction section 71 extracts feature vectors by performing acoustic analysis of the voice data in units of predetermined frames, and supplies the sequence of feature vectors to the matching section 72 and the non-registered word section processing section 76.
[0131] In the following step SP 32, the matching section 72 performs the score calculation described above on the feature-vector sequence from the feature-extraction section 71, and then, in step SP 33, obtains and outputs the headwords of the word sequence that constitutes the speech recognition result, based on the scores obtained by the score calculation.

[0132] Furthermore, in the following step SP 34, the matching section 72 judges whether a non-registered word was included in the user's voice.
[0133] When it is judged in step SP 34 that no non-registered word was included in the user's voice (that is, when the speech recognition result was obtained without applying the above rule for non-registered words "$pat1=$color1 $garbage $color2;"), the procedure proceeds to step SP 35 and the processing ends.

[0134] On the other hand, when it is judged in step SP 34 that a non-registered word was included in the user's voice, that is, when the speech recognition result was obtained by applying the rule for non-registered words "$pat1=$color1 $garbage $color2;", the matching section 72, in the following step SP 35, detects the voice section corresponding to the variable $garbage of the rule for non-registered words as the voice section of the non-registered word, and detects the phoneme sequence given by the phoneme transitions in the garbage model that the variable $garbage represents as the phoneme sequence of the non-registered word. It then supplies the voice section and the phoneme sequence of the non-registered word to the non-registered word section processing section 76, and the processing ends (step SP 36).

[0135] Meanwhile, the non-registered word section processing section 76 temporarily stores the feature-vector sequence supplied from the feature-extraction section 71 and, when the voice section and the phoneme sequence of a non-registered word are supplied from the matching section 72, detects the feature-vector sequence of the voice in that voice section. Furthermore, the non-registered word section processing section 76 assigns an ID to the non-registered word (phoneme sequence) from the matching section 72, and supplies it to the feature-vector buffer 77 together with the phoneme sequence of the non-registered word and the feature-vector sequence of the voice section.
[0136] When the ID, phoneme sequence, and feature-vector sequence of a new non-registered word are stored in the feature-vector buffer 77 as described above, the processing of the non-registered word is subsequently performed according to the non-registered word processing procedure RT 3 shown in drawing 21.

[0137] That is, in the speech recognition section 60, when the ID, phoneme sequence, and feature-vector sequence of a new non-registered word are stored in the feature-vector buffer 77 as described above, this non-registered word processing procedure RT 3 starts at step SP 40, and in step SP 41 the clustering section 78 first reads the ID and the phoneme sequence of the new non-registered word from the feature-vector buffer 77.
[0138] Next, in step SP 42, the clustering section 78 refers to the score sheet of the score sheet storage section 79 and judges whether a cluster that has already been obtained (generated) exists.

[0139] When it is judged in step SP 42 that no already-obtained cluster exists, that is, when the new non-registered word is the first non-registered word and no entry for a stored non-registered word exists in the score sheet, the procedure proceeds to step SP 43. There the clustering section 78 newly generates a cluster having the new non-registered word as its representative member, and updates the score sheet by registering the information about the new cluster and the information about the new non-registered word in the score sheet of the score sheet storage section 79.

[0140] That is, the clustering section 78 registers in the score sheet (drawing 19) the ID and the phoneme sequence of the new non-registered word read from the feature-vector buffer 77. Furthermore, the clustering section 78 generates a unique cluster number and registers it in the score sheet as the cluster number of the new non-registered word. The clustering section 78 also registers the ID of the new non-registered word in the score sheet as the representative member ID of the new non-registered word. In this case, therefore, the new non-registered word becomes the representative member of the new cluster.

[0141] In this case, since no stored non-registered word exists against which a score with the new non-registered word could be calculated, no score calculation is performed.

[0142] After the processing of step SP 43, the procedure proceeds to step SP 52, where the maintenance section 80 updates the word dictionary of the dictionary storage section 74 based on the score sheet updated in step SP 43, and the processing ends (step SP 54).

[0143] That is, since a new cluster has been generated in this case, the maintenance section 80 recognizes the newly generated cluster by referring to the cluster numbers in the score sheet. The maintenance section 80 then adds an entry corresponding to that cluster to the word dictionary of the dictionary storage section 74, and registers, as the phoneme sequence of that entry, the phoneme sequence of the representative member of the new cluster, which in this case is the phoneme sequence of the new non-registered word.

[0144] On the other hand, when it is judged in step SP 42 that an already-obtained cluster exists, that is, when the new non-registered word is not the first non-registered word and entries (rows) for stored non-registered words therefore exist in the score sheet (drawing 19), the procedure proceeds to step SP 44. There the clustering section 78 calculates the score of the new non-registered word against each stored non-registered word, and also the score of each stored non-registered word against the new non-registered word.
[0145] That is, suppose for example that N stored non-registered words with IDs 1 through N now exist and that the ID of the new non-registered word is N+1. The clustering section 78 then calculates the scores s(N+1, 1), s(N+1, 2), ..., s(N+1, N) of the new non-registered word against each of the N stored non-registered words, and the scores s(1, N+1), s(2, N+1), ..., s(N, N+1) of each of the N stored non-registered words against the new non-registered word, which are the parts shown by the dotted lines in drawing 19. Calculating these scores requires the feature-vector sequences of the new non-registered word and of each of the N stored non-registered words; the clustering section 78 recognizes these feature-vector sequences by referring to the feature-vector buffer 77.
[0146] The clustering section 78 then adds the calculated scores to the score sheet (drawing 19) together with the ID and the phoneme sequence of the new non-registered word, and proceeds to step SP 45.

[0147] In step SP 45, by referring to the score sheet (drawing 19), the clustering section 78 detects the cluster whose representative member gives the highest score s(N+1, i) (i = 1, 2, ..., N) for the new non-registered word. That is, by referring to the representative member IDs in the score sheet, the clustering section 78 recognizes the stored non-registered words that are representative members, and by further referring to the scores in the score sheet, it detects the stored non-registered word serving as a representative member for which the score of the new non-registered word is highest. The clustering section 78 then detects the cluster having the cluster number of that detected stored non-registered word.
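
The detection in step SP 45 can be sketched as below. The score sheet layout (a dictionary of rows with `cluster`, `representative`, and `scores` keys) and the function name are assumptions made for illustration. Note that only representative members are compared, so a non-representative member with a higher score does not influence the result.

```python
def detect_cluster(sheet, new_id):
    """Return the cluster number whose representative scores highest.

    sheet: dict mapping ID -> {"cluster", "representative", "scores"},
           where sheet[new_id]["scores"][i] holds s(new_id, i).
    """
    representatives = {
        row["representative"] for i, row in sheet.items() if i != new_id
    }
    best = max(representatives, key=lambda r: sheet[new_id]["scores"][r])
    return sheet[best]["cluster"]
```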

[0148] The procedure then proceeds to step SP 46, and the clustering section 78 adds the new non-registered word to the members of the cluster detected in step SP 45 (hereinafter called the detected cluster where appropriate). That is, the clustering section 78 writes the cluster number of the representative member of the detected cluster into the score sheet as the cluster number of the new non-registered word.

[0149] In step SP 47, the clustering section 78 performs cluster division processing, which divides the detected cluster into two clusters, and proceeds to step SP 48. In step SP 48, the clustering section 78 judges whether the detected cluster could be divided into two clusters by the cluster division processing of step SP 47, and when it judges that the division succeeded, it proceeds to step SP 49. In step SP 49, the clustering section 78 obtains the inter-cluster distance between the two clusters obtained by dividing the detected cluster (hereinafter called the first child cluster and the second child cluster where appropriate).

[0150] Here, the inter-cluster distance between the first and second child clusters is defined, for example, as follows.

[0151] That is, let k denote the ID of an arbitrary member (non-registered word) of either the first child cluster or the second child cluster, and let k1 and k2 denote the IDs of the representative members (non-registered words) of the first and second child clusters, respectively. The inter-cluster distance is then given by the following equation.
[0152]
[Equation 2]

\[ D(k_1, k_2) = \operatorname{maxval}_{k} \bigl\{ \operatorname{abs}\bigl( \log(s(k, k_1)) - \log(s(k, k_2)) \bigr) \bigr\} \tag{2} \]

[0153] The value D(k1, k2) expressed by this equation is taken as the inter-cluster distance between the first and second child clusters.

[0154] In equation (2), abs() denotes the absolute value of the value inside (). maxval_k{} denotes the maximum of the value inside {} obtained by varying k. log denotes the natural logarithm or the common logarithm.

[0155] If the member with ID i is denoted member #i, the reciprocal 1/s(k, k1) of the score in equation (2) corresponds to the distance between member #k and the representative member k1, and the reciprocal 1/s(k, k2) of the score corresponds to the distance between member #k and the representative member k2. Therefore, according to equation (2), the maximum, over the members of the first and second child clusters, of the difference between the distance to the representative member #k1 of the first child cluster and the distance to the representative member #k2 of the second child cluster is taken as the inter-cluster distance between the first and second child clusters.
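
Equation (2) can be written directly as a short function. The function name and the callable `s` are hypothetical, the natural logarithm is chosen here (the text allows either base), and all scores are assumed to be positive so that the logarithm is defined.

```python
import math


def inter_cluster_distance(members, k1, k2, s):
    """D(k1, k2) = max over k of |log s(k, k1) - log s(k, k2)|.

    members: IDs of all members of both child clusters.
    k1, k2: IDs of the representative members of the child clusters.
    s: callable s(i, j) returning a positive score.
    """
    return max(
        abs(math.log(s(k, k1)) - math.log(s(k, k2))) for k in members
    )
```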

[0156] The inter-cluster distance is not limited to the above. For example, DP matching may be performed between the representative member of the first child cluster and the representative member of the second child cluster to obtain the accumulated value of the distances in feature-vector space, and that accumulated value may be used as the inter-cluster distance.

[0157] After the processing of step SP 49, the procedure proceeds to step SP 50, and the clustering section 78 judges whether the inter-cluster distance between the first and second child clusters is greater than a predetermined threshold ξ (or is greater than or equal to the threshold ξ).

[0158] When it is judged in step SP 50 that the inter-cluster distance is greater than the predetermined threshold ξ (that is, when the plural non-registered words that are members of the detected cluster are considered, from their acoustic features, to be words that should be clustered into two clusters), the procedure proceeds to step SP 51, and the clustering section 78 registers the first and second child clusters in the score sheet of the score sheet storage section 79.

[0159] That is, the clustering section 78 assigns a unique cluster number to
each of the 1st and 2nd child clusters, and updates the score sheet so that,
among the members of the detection cluster, those clustered into the 1st child
cluster carry the cluster number of the 1st child cluster and those clustered
into the 2nd child cluster carry the cluster number of the 2nd child cluster.

[0160] Furthermore, the clustering section 78 updates the score sheet so that
the representative member ID of each member clustered into the 1st child cluster
is set to the ID of the representative member of the 1st child cluster, and the
representative member ID of each member clustered into the 2nd child cluster is
set to the ID of the representative member of the 2nd child cluster.
[0161] The cluster number of the detection cluster may be assigned to either one
of the 1st and 2nd child clusters.

[0162] When the clustering section 78 has registered the 1st and 2nd child
clusters in the score sheet as described above, processing proceeds from step
S51 to step S52, where the maintenance section 80 updates the word dictionary of
the dictionary storage section 74 on the basis of the score sheet, and the
processing ends (step SP 54).
[0163] That is, since the detection cluster has in this case been divided into
the 1st and 2nd child clusters, the maintenance section 80 first deletes the
entry corresponding to the detection cluster from the word dictionary. The
maintenance section 80 then adds two entries to the word dictionary, one for
each of the 1st and 2nd child clusters. It registers the phoneme sequence of
the representative member of the 1st child cluster as the phoneme sequence of
the entry corresponding to the 1st child cluster, and the phoneme sequence of
the representative member of the 2nd child cluster as the phoneme sequence of
the entry corresponding to the 2nd child cluster.
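
The dictionary update performed after a split can be sketched as follows. This is only an illustration under assumptions: the word dictionary is modelled as a plain mapping from cluster number to the phoneme sequence of the representative member, and the function name `apply_split_to_dictionary` and the sample phoneme strings are hypothetical, not taken from the specification.

```python
def apply_split_to_dictionary(dictionary, detection_id, child1, child2):
    """Replace the entry of a split cluster with one entry per child cluster.

    dictionary maps a cluster number to the phoneme sequence of that
    cluster's representative member (an assumed, simplified data model).
    child1 and child2 are (cluster_number, phoneme_sequence) pairs.
    """
    updated = dict(dictionary)
    updated.pop(detection_id, None)          # delete the detection cluster's entry
    for cluster_number, phonemes in (child1, child2):
        updated[cluster_number] = phonemes   # add one entry per child cluster
    return updated


words = {1: "doroa:", 2: "kuro"}
# The 1st child cluster reuses the detection cluster's number (2);
# the 2nd child cluster receives a new number (3).
words = apply_split_to_dictionary(words, 2, (2, "kuro"), (3, "ohoN"))
```

Because the old entry is removed before the two new ones are added, one child cluster may safely reuse the cluster number of the detection cluster.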

[0164] On the other hand, when it is judged in step S48 that the cluster
division processing of step S47 could not divide the detection cluster into two
clusters, or when it is judged in step S50 that the inter-cluster distance
between the 1st and 2nd child clusters is not greater than the predetermined
threshold xi (i.e., when the acoustic features of the non-registered words that
are members of the detection cluster are not dissimilar enough to warrant
clustering them into the 1st and 2nd child clusters), processing proceeds to
step S53, where the clustering section 78 obtains a new representative member
of the detection cluster and updates the score sheet.

[0165] That is, for each member of the detection cluster to which the new
non-registered word has been added as a member, the clustering section 78
obtains the scores s(k', k) needed to evaluate expression (1) by referring to
the score sheet of the score sheet storage section 79. The clustering section
78 then uses the obtained scores s(k', k) to find, on the basis of expression
(1), the ID of the member that becomes the new representative member of the
detection cluster. The clustering section 78 then rewrites the representative
member ID of each member of the detection cluster in the score sheet
( drawing 19 ) to the ID of the new representative member of the detection
cluster.

[0166] Then, it progresses to step S52, and the maintenance section 80 updates 
the word dictionary of the dictionary storage section 74 based on a score sheet, 
and ends processing (step SP 54). 



[0167] That is, in this case, the maintenance section 80 refers to the score
sheet to identify the new representative member of the detection cluster and,
further, the phoneme sequence of that representative member. The maintenance
section 80 then changes the phoneme sequence of the entry corresponding to the
detection cluster in the word dictionary to the phoneme sequence of the new
representative member of the detection cluster.
[0168] Here, cluster division processing of the step SP 47 of drawing 21 is 
performed according to the cluster division procedure RT 4 shown in drawing 22 . 
[0169] That is, in the speech recognition processing section 60, when
processing proceeds from the step SP 46 of drawing 22 to the step SP 47, this
cluster division procedure RT 4 is started in a step SP 60. First, in step S61,
the clustering section 78 chooses, from the detection cluster to which the new
non-registered word has been added as a member, an arbitrary combination of two
members that has not yet been chosen, and makes each of them a temporary
representative member. These two temporary representative members are
hereafter called the 1st temporary representative member and the 2nd temporary
representative member as appropriate.

[0170] In the following step S62, the clustering section 78 judges whether the
members of the detection cluster can be divided into two clusters such that the
1st temporary representative member and the 2nd temporary representative member
each become a representative member.
[0171] Judging whether the 1st or 2nd temporary representative member can be
made a representative member requires evaluating expression (1); the scores
s(k', k) used for this calculation are obtained by referring to the score sheet.
[0172] When it is judged in step S62 that the members of the detection cluster
cannot be divided into two clusters such that the 1st temporary representative
member and the 2nd temporary representative member each become a representative
member, step S63 is skipped and processing proceeds to step S64.

[0173] When it is judged in step S62 that the members of the detection cluster
can be divided into two clusters such that the 1st temporary representative
member and the 2nd temporary representative member each become a representative
member, processing proceeds to step S63. The clustering section 78 divides the
members of the detection cluster into two clusters such that the 1st and 2nd
temporary representative members each become a representative member, takes the
group of two clusters resulting from the division as a candidate for the 1st
and 2nd child clusters that form the division result of the detection cluster
(hereafter called a group of candidate clusters as appropriate), and proceeds
to step S64.



[0174] In step S64, the clustering section 78 judges whether, among the members
of the detection cluster, there is any group of two members that has not yet
been chosen as the group of the 1st and 2nd temporary representative members.
When it judges that there is, processing returns to step S61, a group of two
members of the detection cluster that has not yet been chosen as the group of
the 1st and 2nd temporary representative members is chosen, and the same
processing is repeated.

[0175] When it is judged in step S64 that there is no group of two members of
the detection cluster that has not yet been chosen as the group of the 1st and
2nd temporary representative members, processing proceeds to step S65, and the
clustering section 78 judges whether a group of candidate clusters exists.
[0176] When it is judged in step S65 that no group of candidate clusters
exists, step S66 is skipped and the procedure returns. In this case, it is
judged in step S48 of drawing 21 that the detection cluster could not be
divided.

[0177] On the other hand, when it is judged in step S65 that a group of
candidate clusters exists, processing proceeds to step S66. When two or more
groups of candidate clusters exist, the clustering section 78 finds the
inter-cluster distance between the two clusters of each group, selects the
group with the minimum inter-cluster distance, and returns that group as the
division result of the detection cluster, i.e., as the 1st and 2nd child
clusters. When only one group of candidate clusters exists, that group is used
as the 1st and 2nd child clusters as it is.

[0178] In this case, it is judged in step S48 of drawing 21 that the detection
cluster could be divided.
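
The control flow of steps S61 to S66 can be sketched as follows. The two callables are assumptions standing in for expression (1) (whether a pair of temporary representative members yields a valid division) and expression (2) (the inter-cluster distance), neither of which is reproduced here; the toy criteria in the usage lines are hypothetical and chosen only to make the sketch runnable.

```python
from itertools import combinations


def divide_cluster(members, try_split, inter_cluster_distance):
    """Return the chosen (1st, 2nd) child clusters, or None if no division exists.

    try_split(members, rep1, rep2) -> (cluster1, cluster2) or None   [hypothetical]
    inter_cluster_distance(cluster1, cluster2) -> float              [hypothetical]
    """
    candidates = []
    for rep1, rep2 in combinations(members, 2):   # steps S61 and S64: every pair once
        split = try_split(members, rep1, rep2)    # step S62: can the cluster be divided?
        if split is not None:
            candidates.append(split)              # step S63: keep the candidate group
    if not candidates:                            # step S65: no candidate group exists
        return None
    # Step S66: return the group with the minimum inter-cluster distance.
    return min(candidates, key=lambda pair: inter_cluster_distance(*pair))


# Toy stand-ins: members are numbers, a division is accepted only when the two
# temporary representatives are at least 3 apart, and each member joins the
# nearer representative.
def toy_split(members, rep1, rep2):
    if abs(rep1 - rep2) < 3:
        return None
    first = [m for m in members if abs(m - rep1) <= abs(m - rep2)]
    second = [m for m in members if abs(m - rep1) > abs(m - rep2)]
    return first, second


def toy_distance(c1, c2):
    return abs(sum(c1) / len(c1) - sum(c2) / len(c2))


result = divide_cluster([1, 2, 9, 10], toy_split, toy_distance)
```

Returning `None` corresponds to the case in which step S48 judges that the detection cluster could not be divided.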

[0179] As described above, the clustering section 78 detects, from among the
clusters into which the non-registered words obtained so far have been
clustered, the cluster (detection cluster) to which a new non-registered word
is to be added as a new member, adds the new non-registered word to the
detection cluster as a new member, and divides the detection cluster on the
basis of its members. A new non-registered word can therefore easily be
clustered together with the non-registered words whose acoustic features are
close to it.

[0180] Furthermore, since the maintenance section 80 updates the word
dictionary on the basis of such a clustering result, non-registered words can
easily be registered in the word dictionary while the word dictionary is kept
from becoming large.

[0181] Moreover, even if the matching section 72 mistakes the detection of the
voice section of a non-registered word, such a non-registered word is
clustered, through division of the detection cluster, into a cluster different
from that of non-registered words whose voice section was detected correctly.
An entry corresponding to such a cluster will be registered in the word
dictionary, but since the phoneme sequence of this entry corresponds to a voice
section that was not detected correctly, it does not yield a high score in
subsequent speech recognition. Therefore, even if detection of the voice
section of a non-registered word is mistaken, the error hardly influences
subsequent speech recognition.
[0182] Drawing 23 shows the clustering result obtained by uttering
non-registered words. In drawing 23 , each entry (each line) represents one
cluster. The left column of drawing 23 shows the phoneme sequence of the
representative member (non-registered word) of each cluster, and the right
column shows the content and number of the utterances of non-registered words
that are members of each cluster.
[0183] That is, in drawing 23 , the entry on the 1st line represents a cluster
whose only member is one utterance of the non-registered word "a bath", and the
phoneme sequence of its representative member is "doroa:" (drawer). The entry
on the 2nd line represents a cluster whose members are three utterances of the
non-registered word "a bath", and the phoneme sequence of its representative
member is "kuro" (clo).

[0184] Furthermore, the entry on the 7th line represents a cluster whose
members are four utterances of the non-registered word "a book", and the
phoneme sequence of its representative member is "NhoNde:su" (NHONTESU). The
entry on the 8th line represents a cluster whose members are one utterance of
the non-registered word "Orange" and 19 utterances of the non-registered word
"a book", and the phoneme sequence of its representative member is "ohoN"
(OHON). The other entries are read in the same way.
[0185] Drawing 23 shows that utterances of the same non-registered word are
clustered well.

[0186] In the entry on the 8th line of drawing 23 , one utterance of the
non-registered word "Orange" and 19 utterances of the non-registered word "a
book" are clustered into the same cluster. Judging from the utterances that are
its members, this cluster should be a cluster of the non-registered word "a
book", yet an utterance of the non-registered word "Orange" is also a member.
However, if further utterances of the non-registered word "a book" are input
later, this cluster can be expected to be divided into a cluster whose members
are only utterances of the non-registered word "a book" and a cluster whose
member is only the utterance of the non-registered word "Orange".
[0187] (4-2) Concrete configuration of the face recognition section 62
Next, the concrete configuration of the face recognition section 62 is
explained.
[0188] As shown in drawing 24 and drawing 25 , the face recognition section 62
can respond within a fixed time in a dynamically changing environment, and
consists of the face extract processing section 90, which extracts a face
pattern from the image based on picture signal S1A given from CCD camera 50
( drawing 5 ), and the face recognition processing section 91, which recognizes
a face on the basis of the extracted face pattern. In this embodiment, Gabor
filtering is adopted for the face extract processing that extracts a face
pattern, and a support vector machine (SVM) is adopted for the face recognition
processing that recognizes a face from the face pattern.
[0189] This face recognition section 62 has a learning phase, in which the face
recognition processing section 91 learns face patterns, and a recognition
phase, in which the face pattern extracted from picture signal S1A is
recognized on the basis of the learned data.

[0190] The configuration of the face recognition section 62 in the learning
phase is shown in drawing 24 , and its configuration in the recognition phase
is shown in drawing 25 .



[0191] In the learning phase, as shown in drawing 24 , the result of extracting
a face from the captured image of a user input from CCD camera 50 ( drawing 5 )
in the face extract processing section 90, which consists of Gabor filters, is
supplied to the face recognition processing section 91, which consists of a
support vector machine. The face recognition processing section 91 obtains a
provisional discriminant function using the data for learning supplied from
outside, i.e., the teacher data.
[0192] In the recognition phase, as shown in drawing 25 , the result of
extracting the face of a person in the image based on picture signal S1A
supplied from CCD camera 50 in the face extract processing section 90 is
supplied to the face recognition processing section 91. The face recognition
processing section 91 tries the provisionally obtained discriminant function on
images in various databases and detects faces. Those for which detection
succeeded are output as face data. Those for which detection failed are added
to the learning data as non-face data, and learning is performed again.

[0193] Hereafter, the Gabor filtering processing in the face extract processing
section 90 and the support vector machine in the face recognition processing
section 91 are each explained in detail.

[0194] (4-2-1) Gabor filtering processing
It is already known that human visual cells include cells that are selective to
a particular orientation. These consist of cells that respond to vertical lines
and cells that respond to horizontal lines. Gabor filtering is likewise a
spatial filter consisting of two or more filters with orientation selectivity.

[0195] A Gabor filter is expressed spatially by a Gabor function. The Gabor
function g(x, y) is given by the following expression.
[0196]
[Equation 3]

g(x, y) = s(x, y)\, w_r(x, y) \qquad (3)

[0197] As shown, it consists of a carrier s(x, y) made up of a cosine component
and an envelope w_r(x, y) in the form of a two-dimensional Gaussian.

[0198] The carrier s(x, y) is expressed using a complex function as in
expression (4) below. Here, the coordinate value (u_0, v_0) represents the
spatial frequency, and P represents the phase of the cosine component.
[0199] The carrier is given by the following expression.
[0200]
[Equation 4]

s(x, y) = \exp\bigl( j ( 2\pi (u_0 x + v_0 y) + P ) \bigr) \qquad (4)



[0201] The carrier shown in expression (4) can be separated, as in the
following expression, into a real component Re(s(x, y)) and an imaginary
component Im(s(x, y)).
[0202]
[Equation 5]

\mathrm{Re}(s(x, y)) = \cos\bigl( 2\pi (u_0 x + v_0 y) + P \bigr)
\mathrm{Im}(s(x, y)) = \sin\bigl( 2\pi (u_0 x + v_0 y) + P \bigr) \qquad (5)

[0203]
[0204] On the other hand, the envelope, which consists of a two-dimensional
Gaussian distribution, is expressed by the following expression.
[0205]
[Equation 6]

w_r(x, y) = K \exp\bigl( -\pi ( a^2 (x - x_0)_r^2 + b^2 (y - y_0)_r^2 ) \bigr) \qquad (6)

[0206]

[0207] Here, the coordinates (x_0, y_0) are the peak of this function, and the
constants a and b are the scale parameters of the Gaussian distribution. The
subscript r denotes the rotation operation shown in the following expression.
[0208]
[Equation 7]

(x - x_0)_r = (x - x_0) \cos\theta + (y - y_0) \sin\theta
(y - y_0)_r = -(x - x_0) \sin\theta + (y - y_0) \cos\theta \qquad (7)

[0209]

[0210] Therefore, from expressions (4) and (6) above, a Gabor filter is
expressed as the spatial function shown in the following expression.
[0211]
[Equation 8]

g(x, y) = K \exp\bigl( -\pi ( a^2 (x - x_0)_r^2 + b^2 (y - y_0)_r^2 ) \bigr) \exp\bigl( j ( 2\pi (u_0 x + v_0 y) + P ) \bigr) \qquad (8)

[0212]
[0213] The face extract processing section 90 of this embodiment uses eight
orientations and three frequencies, and performs face extract processing using
a total of 24 Gabor filters.
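
A bank of 8 orientations x 3 frequencies can be sketched by evaluating the Gabor function (carrier times rotated Gaussian envelope) for each combination. This is a minimal sketch, not the implementation of the face extract processing section 90: the frequency values, the scale parameters `a` and `b`, the choice of tying the carrier direction to the envelope rotation angle, and the function names are all assumptions made for illustration.

```python
import cmath
import math


def gabor(x, y, u0, v0, theta, a=0.05, b=0.05, K=1.0, P=0.0, x0=0.0, y0=0.0):
    """Evaluate the complex Gabor function g(x, y) = w_r(x, y) * s(x, y)."""
    dx, dy = x - x0, y - y0
    # Rotated coordinates (x - x0)_r and (y - y0)_r.
    xr = dx * math.cos(theta) + dy * math.sin(theta)
    yr = -dx * math.sin(theta) + dy * math.cos(theta)
    envelope = K * math.exp(-math.pi * (a * a * xr * xr + b * b * yr * yr))
    carrier = cmath.exp(1j * (2 * math.pi * (u0 * x + v0 * y) + P))
    return envelope * carrier


def gabor_bank(frequencies=(0.1, 0.2, 0.4), n_orientations=8):
    """Return (u0, v0, theta) for every frequency/orientation combination.

    The frequency values are assumed; the source only states that three
    frequencies and eight orientations are used.
    """
    bank = []
    for f in frequencies:
        for k in range(n_orientations):
            theta = math.pi * k / n_orientations
            bank.append((f * math.cos(theta), f * math.sin(theta), theta))
    return bank


bank = gabor_bank()
center_value = gabor(0.0, 0.0, 0.1, 0.0, 0.0)
```

Sampling `gabor` over a pixel grid for each of the 24 parameter triples gives the 24 filter kernels.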
[0214] Let G_i be the i-th Gabor filter, J_i the result (Gabor Jet) of the i-th
Gabor filter, and I the input image. The response of the Gabor filter is then
expressed by the following expression.
[0215]
[Equation 9]

J_i(x, y) = G_i(x, y) \otimes I(x, y) \qquad (9)

[0216] In practice, the operation of expression (9) can be accelerated using a
fast Fourier transform.

[0217] To examine the performance of the created Gabor filters, the image is
reconstructed from the pixels obtained by filtering. The reconstructed image H
is expressed by the following expression.
[0218]
[Equation 10]

H(x, y) = \sum_{i=1}^{O} a_i J_i(x, y) \qquad (10)

[0219]

[0220] The error E between the input image I and the reconstructed image H is
expressed by the following expression.
[0221]
[Equation 11]

E = \frac{1}{2} \| I(x, y) - H(x, y) \|^2 = \frac{1}{2} \sum_{x, y} \bigl( I(x, y) - H(x, y) \bigr)^2 \qquad (11)

[0222]
[0223] The image can be reconstructed by finding the optimal a that minimizes
this error E.
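
Finding the weights a that minimize the squared reconstruction error is an ordinary least-squares problem. The sketch below solves its normal equations directly; it assumes real-valued responses (for example, the real parts of the Gabor responses) flattened into lists, and a non-singular Gram matrix. The helper names and the sample data are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def reconstruction_weights(image, responses):
    """Weights a_i minimising E = 1/2 * sum over pixels of (I - sum_i a_i J_i)^2."""
    n = len(responses)
    # Normal equations: (J J^T) a = J I, with responses as the rows of J.
    gram = [[sum(p * q for p, q in zip(responses[i], responses[j]))
             for j in range(n)] for i in range(n)]
    rhs = [sum(p * q for p, q in zip(responses[i], image)) for i in range(n)]
    return solve_linear(gram, rhs)


J1 = [1.0, 0.0, 2.0, 1.0]
J2 = [0.0, 1.0, 1.0, 3.0]
I = [2 * p + 3 * q for p, q in zip(J1, J2)]   # image built from known weights
a = reconstruction_weights(I, [J1, J2])       # recovers approximately [2, 3]
```

With 24 filters the Gram matrix is only 24 x 24, so a direct solve of this kind stays cheap regardless of image size.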
[0224] (4-2-2) Support vector machine
In this embodiment, face recognition in the face recognition processing section
91 identifies the relevant face using a support vector machine (SVM), which is
regarded as having the highest learning generalization capability in the field
of pattern recognition.
[0225] For SVM itself, the report by B. Sholkopf et al. (B. Sholkopf, C.Burges,
A.Smola, "Advance in Kernel Support Vector Learning", The MIT Press, 1999.) can
be cited, for example. The results of preliminary experiments conducted by the
applicant show that face recognition by SVM gives better results than
techniques using principal component analysis (PCA) or a neural network.
[0226] SVM is a learning machine that uses a linear discriminator (perceptron)
as its discriminant function, and it can be extended to nonlinear spaces by
using a kernel function. The discriminant function is learned so that the
margin of separation between classes is maximized, and since the solution is
obtained by solving a quadratic programming problem, it is theoretically
guaranteed that the global solution can be reached.



[0227] Usually, the problem of pattern recognition is to obtain, for a test
sample x = (x_1, x_2, ..., x_n), the discriminant function f(x) given by the
following expression.
[0228]
[Equation 12]

f(x) = \sum_{j=1}^{n} w_j x_j + b \qquad (12)

[0229]

[0230] Here, the teacher labels for SVM learning are set as in the following
expression.
[0231]
[Equation 13]

y = (y_1, y_2, \ldots, y_n) \qquad (13)

[0232]

[0233] Recognition of a face pattern by SVM can then be regarded as the problem
of minimizing the square of the weight vector w under the constraint shown in
the following expression.
[0234]
[Equation 14]

y_i ( w^T x_i + b ) \ge 1 \qquad (14)

[0235]

[0236] A problem with such a constraint can be solved using Lagrange's method
of undetermined multipliers. That is, the Lagrangian shown in the following
expression is first introduced.
[0237]
[Equation 15]

L(w, b, a) = \frac{1}{2} \| w \|^2 - \sum_i a_i \bigl( y_i ( w^T x_i + b ) - 1 \bigr) \qquad (15)

[0238] Then, as shown in the following expression, it is partially
differentiated with respect to each of b and w.
[0239]
[Equation 16]

\frac{\partial L}{\partial b} = \frac{\partial L}{\partial w} = 0 \qquad (16)

[0240]

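[0241] As a result, discrimination of a face pattern by SVM can be regarded as
the quadratic programming problem shown in the following expression.
[0242]
[Equation 17]

\max \; \sum_i a_i - \frac{1}{2} \sum_{i, j} a_i a_j y_i y_j x_i^T x_j
\text{subject to: } a_i \ge 0, \quad \sum_i a_i y_i = 0 \qquad (17)

[0243]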
[0244] When the number of dimensions of the feature space is smaller than the
number of training samples, a slack variable \xi \ge 0 is introduced and the
constraint is changed as in the following expression.
[0245]
[Equation 18]

y_i ( w^T x_i + b ) \ge 1 - \xi_i \qquad (18)

[0246]



[0247] For the optimization, the objective function of the following expression
is minimized.
[0248]
[Equation 19]

\frac{1}{2} \| w \|^2 + C \sum_i \xi_i \qquad (19)

[0249]
[0250] In expression (19), C is a coefficient that specifies how far the
constraint is loosened, and its value must be determined experimentally.
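
The role of the coefficient C can be illustrated by evaluating a soft-margin objective for a fixed linear discriminator. The sketch assumes the standard soft-margin formulation, in which the slack of a sample is max(0, 1 - y f(x)) and the objective is half the squared norm of w plus C times the total slack; the function name and the sample values are hypothetical.

```python
def soft_margin_objective(w, b, samples, labels, C):
    """Return (objective, slack) for the assumed standard soft-margin form.

    objective = 1/2 * ||w||^2 + C * sum(slack)
    slack_i   = max(0, 1 - y_i * f(x_i)),  f(x) = w . x + b
    """
    def f(x):
        return sum(wj * xj for wj, xj in zip(w, x)) + b

    slack = [max(0.0, 1.0 - y * f(x)) for x, y in zip(samples, labels)]
    objective = 0.5 * sum(wj * wj for wj in w) + C * sum(slack)
    return objective, slack


samples = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
labels = [1, -1, 1]
objective, slack = soft_margin_objective([1.0, 0.0], 0.0, samples, labels, C=10.0)
```

Only the third sample lies inside the margin and contributes slack; a larger C penalizes that violation more heavily, which is why C has to be tuned experimentally.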
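[0251] The problem concerning the Lagrange multipliers a is changed as in the
following expression.
[0252]
[Equation 20]

\max \; \sum_i a_i - \frac{1}{2} \sum_{i, j} a_i a_j y_i y_j x_i^T x_j
\text{subject to: } 0 \le a_i \le C, \quad \sum_i a_i y_i = 0 \qquad (20)

[0253]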

[0254] However, a nonlinear problem cannot be solved with expression (20).
Therefore, in this embodiment, a kernel function K(x, x') is introduced, the
data are first mapped to a high-dimensional space (kernel trick), and linear
separation is performed in that space. This is equivalent to performing
nonlinear separation in the original space.
[0255] The kernel function is expressed using a certain mapping \Phi as in the
following expression.
[0256]
[Equation 21]

K(x, x') = \Phi(x)^T \Phi(x') \qquad (21)

[0257]

[0258] The discriminant function shown in expression (12) can also be expressed
as in the following expression.
[0259]
[Equation 22]

f(\Phi(x)) = w^T \Phi(x) + b = \sum_i a_i y_i K(x, x_i) + b \qquad (22)

[0260]

[0261] Learning can likewise be regarded as the quadratic programming problem
shown in the following expression.
[0262]
[Equation 23]

\max \; \sum_i a_i - \frac{1}{2} \sum_{i, j} a_i a_j y_i y_j K(x_i, x_j)
\text{subject to: } 0 \le a_i \le C, \quad \sum_i a_i y_i = 0 \qquad (23)

[0263]
[0264] As the kernel, the Gaussian kernel (RBF (Radial Basis Function)) shown
in the following expression can be used.
[0265]
[Equation 24]

K(x, x') = \exp\left( - \frac{ | x - x' |^2 }{ \sigma^2 } \right) \qquad (24)

[0266]
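
The kernel form of the discriminant function, f(x) = sum_i a_i y_i K(x, x_i) + b, can be sketched with a Gaussian kernel as follows. The width parameter `sigma`, the support vectors, and the multiplier values are assumptions chosen for illustration; in practice the multipliers and b come out of the quadratic programming step.

```python
import math


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian kernel exp(-|x - z|^2 / sigma^2); sigma is an assumed width."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(x, z))
    return math.exp(-squared_distance / sigma ** 2)


def discriminant(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Kernel discriminant f(x) = sum_i a_i * y_i * K(x, x_i) + b."""
    return sum(a * y * kernel(x, sv)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b


support_vectors = [[0.0, 0.0], [2.0, 0.0]]
labels = [1, -1]
alphas = [1.0, 1.0]

near_positive = discriminant([0.0, 0.0], support_vectors, labels, alphas, 0.0)
near_negative = discriminant([2.0, 0.0], support_vectors, labels, alphas, 0.0)
midpoint = discriminant([1.0, 0.0], support_vectors, labels, alphas, 0.0)
```

The sign of the returned value gives the class; a point equidistant from the two support vectors lands on the decision boundary.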

[0267] The kind of filter used for Gabor filtering may be changed according to
the recognition task.

[0268] In filtering at low frequencies, it is redundant to hold all the
filtered images as vectors. They may therefore be downsampled to reduce the
dimension of the vectors. The 24 kinds of downsampled vectors are arranged in a
single row to form one long vector.
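
Downsampling the 24 filtered images and concatenating them into one long vector can be sketched as follows. Plain decimation (keeping every `step`-th pixel) is an assumption; the source does not say which downsampling method is used, and the function names and toy data are hypothetical.

```python
def downsample(image, step):
    """Keep every step-th pixel in both directions (plain decimation, assumed)."""
    return [row[::step] for row in image[::step]]


def feature_vector(responses, step=2):
    """Concatenate the downsampled responses into one long vector."""
    vector = []
    for response in responses:
        for row in downsample(response, step):
            vector.extend(row)
    return vector


# 24 toy "filtered images" of 4 x 4 pixels each.
responses = [[[i * 100 + r * 10 + c for c in range(4)] for r in range(4)]
             for i in range(24)]
vec = feature_vector(responses, step=2)
```

Each 4 x 4 response shrinks to 2 x 2, so the long vector has 24 x 4 = 96 elements.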
[0269] Since the SVM applied to recognition of a face pattern in this
embodiment is a discriminator that divides the feature space into two, it is
trained to distinguish between "Man A" and "not Man A". Therefore, face images
of Man A are first collected from the images of a database, and the label "not
Man A" is attached to the vectors after Gabor filtering. In general, the number
of face images collected should be larger than the dimension of the feature
space. When the faces of ten persons are to be recognized, one discriminator is
likewise constructed for each person, such as "Man B" and "not Man B".

[0270] Such learning finds the support vectors that separate "Man A" from "not
Man A". SVM is a discriminator that divides the feature space into two; when a
new face image is input, it outputs a recognition result according to which
side of the boundary surface formed by the obtained support vectors the
Gabor-filtered vector lies on. If the vector lies in the "Man A" region with
respect to the boundary, it is recognized as "Man A". If it lies in the "not
Man A" region, it is recognized as "not Man A".
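
Recognizing several persons with one two-class discriminator per person can be sketched as follows. Each discriminator is assumed to be a callable returning a signed value that is positive on the "is this person" side of its boundary; picking the most positive response when several fire is an assumption, not something the source specifies, and the toy discriminators are hypothetical.

```python
def recognize(feature, discriminators):
    """Return the name whose discriminator responds most positively, else None."""
    best_name, best_score = None, 0.0
    for name, f in discriminators.items():
        score = f(feature)
        if score > best_score:          # positive side of this person's boundary
            best_name, best_score = name, score
    return best_name


# Toy one-dimensional discriminators standing in for trained SVMs.
discriminators = {
    "Man A": lambda v: 1.0 - abs(v[0] - 1.0),
    "Man B": lambda v: 1.0 - abs(v[0] - 5.0),
}

who = recognize([1.2], discriminators)
unknown = recognize([10.0], discriminators)
```

A `None` result means every discriminator answered "not this person", which corresponds to an unknown face.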
[0271] The region cut out as the face portion from the image based on picture
signal S1A from CCD camera 50 is not constant. It may therefore be projected
onto a point in the feature space that is distant from the category to be
recognized. The recognition rate may thus be improved by estimating feature
parts such as the eyes, nose, and mouth and morphing by affine transformation.

[0272] A bootstrap technique can also be employed to improve recognition
performance. Images are captured separately from those used for learning and
are used for the bootstrap. This means that when the trained discriminator
outputs a wrong recognition result, the input image is added to the learning
set and learning is performed again.

[0273] Another way to improve recognition performance is to observe how the
recognition result changes over time. The simplest approach is, for example, to
recognize a person as "Man A" when 8 out of 10 results are "Man A". Prediction
methods using a Kalman filter have also been proposed.
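
The simple "8 out of 10" rule can be sketched as a vote over a sliding window of recent recognition results. The window and threshold defaults follow the example in the text; the function name and the handling of an empty history are assumptions.

```python
from collections import Counter, deque


def temporal_vote(results, window=10, required=8):
    """Return the label seen at least `required` times in the last `window` results."""
    recent = deque(results, maxlen=window)   # keeps only the most recent results
    if not recent:
        return None
    label, count = Counter(recent).most_common(1)[0]
    return label if count >= required else None


accepted = temporal_vote(["Man A"] * 8 + ["Man B"] * 2)
rejected = temporal_vote(["Man A"] * 7 + ["Man B"] * 3)
```

Requiring a high agreement ratio trades a short delay for robustness against single-frame misrecognitions.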

[0274] (5) Operation and effects of this embodiment
With the above configuration, this robot 1 acquires the name of a new person
through a dialogue with that person, and stores the name in association with
data on the acoustic features of the person's voice and the morphological
features of the person's face detected on the basis of the outputs of the
microphone 51 and CCD camera 50. On the basis of the various stored data, it
further recognizes the appearance of another new person whose name has not been
acquired, and acquires and stores that person's name, acoustic voice features,
and morphological face features in the same way as above. In this way it learns
people's names.

[0275] Therefore, without requiring name registration by explicit instructions
from a user, such as input of a voice command or pressing of a touch sensor,
this robot 1 can learn the names of new persons, objects, and the like
naturally through ordinary dialogue with people, just as human beings usually
do.

[0276] According to the above configuration, the name of a new person is
acquired through a dialogue with that person and is stored in association with
data on the acoustic features of the person's voice and the morphological
features of the person's face detected on the basis of the outputs of the
microphone 51 and CCD camera 50. On the basis of the stored data, the
appearance of another new person whose name has not been acquired is
recognized, and that person's name, acoustic voice features, and morphological
face features are acquired and stored in the same way as above. By learning
people's names in this way, the names of new persons, objects, and the like can
be learned naturally through ordinary dialogue with people, and a robot that
can markedly improve entertainment value can thus be realized.

[0277] (6) Other embodiments
Although the above embodiment describes the case where this invention is
applied to the bipedal walking robot 1 configured as in drawing 1 , this
invention is not limited to this and is widely applicable to various other
robot apparatuses and to various apparatuses other than robot apparatuses.
[0278] Moreover, the above embodiment describes the case where the dialogue
means, which has a function for conversing with a human being and acquires the
name of the target object from the human being through that dialogue, consists
of the speech recognition section 60, the dialogue control section 63, and the
speech synthesis section 64, so that a person's name is acquired by voice
dialogue with the person. However, this invention is not limited to this, and
the dialogue means may be configured so that a person's name is acquired by a
text dialogue, for example through keyboard entry.



[0279] Furthermore, although the above embodiment describes the case where the
object of name learning is a person, this invention is not limited to this, and
various other objects may be made the object of name learning instead of, or in
addition to, persons.

[0280] In this case, the above embodiment describes the case where the target
person is recognized from the acoustic features of the person's voice and from
the morphological features of the person's face, and whether the person is a
new person is judged on the basis of these recognition results. However, this
invention is not limited to this. Instead of, or in addition to, these
features, the person may be recognized from each of two or more kinds of other
features by which an individual can be biologically identified, such as body
shape and smell, and whether the person is a new person may be judged on the
basis of these recognition results. When the object of name learning is
something other than a person, the object may be recognized from each of two or
more kinds of features by which it can be identified, such as color, shape,
pattern, and size, and whether the object is a new object may be judged on the
basis of these recognition results. In these cases, it suffices to provide two
or more recognition means, each of which detects a different predetermined
feature of the object and recognizes the target object on the basis of the
detection result and the data of the corresponding feature of known objects
stored beforehand.
[0281] Furthermore, although the above embodiment describes the case where the
storage means that stores association information associating the name of a
known object with the recognition result of each recognition means (the
speaker-recognition section 61 and the face recognition section 62) for that
object consists of a memory, this invention is not limited to this, and various
storage means other than a memory that can store information (for example, a
disk-shaped recording medium) can be widely applied.

[0282] Furthermore, although the above embodiment describes the case where the
recognition processing by which the speaker-recognition section 61 and the face
recognition section 62 recognize the target person is performed only once, the
present invention is not limited to this. For example, recognition processing
may be performed once more when recognition is impossible (SID = -1), it may be
repeated at other times as well, or it may be performed multiple times. Doing so
can raise the precision of the recognition result.
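
The retry behavior described in this paragraph can be illustrated with a minimal
sketch. It assumes a recognizer is a callable that returns an integer ID and
returns -1 when recognition is impossible, as with SID = -1 above. The function
name recognize_with_retry and the parameter max_attempts are hypothetical and do
not appear in the embodiment.

```python
from typing import Callable

UNRECOGNIZED = -1  # corresponds to "recognition impossible" (SID = -1) in the text


def recognize_with_retry(recognize: Callable[[], int], max_attempts: int = 3) -> int:
    """Run `recognize` until it returns a usable ID or the attempts run out.

    `recognize` is a hypothetical stand-in for one recognition means,
    such as the speaker-recognition section or the face recognition section.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = UNRECOGNIZED
    for _ in range(max_attempts):
        result = recognize()
        if result != UNRECOGNIZED:
            break  # a usable ID was obtained, so stop retrying
    return result
```

With max_attempts set to 1 this reduces to the single-pass behavior of the
embodiment; larger values correspond to the repeated processing suggested here.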

[0283] Furthermore, although the above embodiment describes the case where the
dialogue control section 63 judges whether the person is a new person by
majority decision of the recognition results of the plurality of recognition
means (the speech recognition section 60, the speaker-recognition section 61,
and the face recognition section 62), the present invention is not limited to
this. Whether the person is a new person may be judged from the recognition
results of the plurality of recognition means by a technique other than majority
decision.
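
As a rough illustration of the majority decision mentioned in this paragraph,
the sketch below assumes that each recognition means has already been reduced to
a single vote: True if its result suggests a new person, False otherwise. How
each vote is derived from the recognition result and the association information
is outside this sketch, and the name is_new_by_majority is hypothetical.

```python
from typing import Mapping


def is_new_by_majority(votes: Mapping[str, bool]) -> bool:
    """Return True when more than half of the recognizers vote "new".

    `votes` maps a recognizer label (an assumption of this sketch, e.g.
    "speech", "speaker", "face") to True if that recognizer's result
    indicates a new person.
    """
    new_votes = sum(1 for vote in votes.values() if vote)
    return new_votes > len(votes) / 2
```

For example, with votes of True from the speech and speaker recognizers and
False from the face recognizer, two of three votes indicate a new person, so the
function returns True.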

[0284] In this case, various approaches can be widely applied. One example is an
approach in which each recognition result of the plurality of recognition means
is weighted according to the recognition performance of that recognition means,
and whether the target object is new is judged based on the weighted recognition
results. Another is an approach in which, when the object can be judged to be a
new person based on the recognition results of the recognition means with the
highest recognition performance and one other recognition means, the recognition
results of the remaining recognition means are not used.
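
The weighting approach can be sketched as follows. This is only one possible
reading: each recognition means casts a vote, the vote is scaled by a weight
reflecting its recognition performance, and the object is judged new when the
weight behind "new" exceeds half of the total weight. The function name, the
weight values, and the one-half threshold are all assumptions of this sketch and
are not specified in the text.

```python
from typing import Mapping


def is_new_weighted(votes: Mapping[str, bool], weights: Mapping[str, float]) -> bool:
    """Judge "new" when the weighted votes for "new" exceed half the total weight.

    `votes` maps a recognizer label to True if its result indicates a new
    object; `weights` maps the same labels to a hypothetical performance score.
    """
    total = sum(weights[name] for name in votes)
    new_weight = sum(weights[name] for name, vote in votes.items() if vote)
    return new_weight > total / 2


# Hypothetical weights: the face recognizer is trusted most in this example.
example_weights = {"speech": 2, "speaker": 3, "face": 6}
example_votes = {"speech": True, "speaker": True, "face": False}
decision = is_new_weighted(example_votes, example_weights)
```

In this example a simple majority would judge the person to be new, but the
weighted decision does not, because the two agreeing recognizers carry a
combined weight of 5 out of 11.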

[0285] Furthermore, although the above embodiment describes the case where
recognition precision is raised through statistical stability by making the
speaker-recognition section 61 or the face recognition section 62 carry out
additional learning when the speaker-recognition section 61 and the face
recognition section 62 have recognized the target person correctly, the present
invention is not limited to this. Similarly, for the association information
stored in the memory 65, a function may be provided that raises the reliability
of the association information by storing the same combination repeatedly.
Specifically, the method using a neural network described in the Transactions of
the Institute of Electronics, Information and Communication Engineers, D-II,
Vol. J82-D-II, No. 6, pp. 1072-1081 can be used as a way of implementing such a
function.
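
The idea of raising reliability by storing the same combination repeatedly can
be illustrated with a much simpler mechanism than the neural network method
cited above. The sketch below swaps in plain occurrence counting: each observed
combination of identifier, SID, and FID is counted, and reliability is the
fraction of observations for that identifier that used the combination. The
class AssociationMemory and its methods are hypothetical and are not the method
of the cited paper.

```python
from collections import Counter
from typing import Tuple


class AssociationMemory:
    """Counts how often each (identifier, SID, FID) combination was stored."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def observe(self, identifier: str, sid: int, fid: int) -> None:
        """Store one more occurrence of this combination."""
        key: Tuple[str, int, int] = (identifier, sid, fid)
        self._counts[key] += 1

    def reliability(self, identifier: str, sid: int, fid: int) -> float:
        """Share of this identifier's observations that match the combination."""
        total = sum(c for (name, _, _), c in self._counts.items() if name == identifier)
        if total == 0:
            return 0.0  # nothing stored for this identifier yet
        return self._counts[(identifier, sid, fid)] / total
```

A combination seen three times out of four observations for the same identifier
would then have a reliability of 0.75 under this counting scheme.
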
[0286] 

[Effect of the Invention] As described above, according to the present
invention, a learning apparatus is provided with: dialogue means having a
function for conversing with a human and acquiring the identifier of a target
object from the human through the dialogue; a plurality of recognition means,
each of which detects a different predetermined feature of the target object and
recognizes the target object based on the detection result and the data of the
corresponding feature of known objects stored beforehand; storage means for
storing association information that associates the identifier of a known object
with the recognition result of each recognition means for that object; decision
means for judging whether the target object is a new object based on the
identifier of the target object acquired by the dialogue means, the recognition
result of each recognition means for the target object, and the association
information stored in the storage means; and control means for, when the
decision means judges the target object to be a new object, making each
recognition means store the data of the corresponding feature of the target
object and making the storage means store the association information for the
target object. The identifiers of new persons, objects, and the like can thereby
be learned naturally through dialogue with ordinary people, in the way humans
usually do, and a learning apparatus that can markedly improve entertainment
value can thus be realized.
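
The cooperation of the storage, decision, and control means summarized above can
be sketched in a few lines. The decision rule used here (the identifier is not
yet stored and both recognizers return -1) is only a placeholder assumption, and
the class Learner, its fields, and the way fresh IDs are assigned are all
hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

UNKNOWN = -1  # a recognizer returns this when it does not recognize the target


@dataclass
class Learner:
    # Storage means: identifier -> (SID, FID) association information.
    associations: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    next_id: int = 1

    def is_new(self, identifier: str, sid: int, fid: int) -> bool:
        """Decision means (placeholder rule, not the rule of the embodiment)."""
        return identifier not in self.associations and sid == UNKNOWN and fid == UNKNOWN

    def learn(self, identifier: str, sid: int, fid: int) -> bool:
        """Control means: register a new target; return True if it was learned."""
        if not self.is_new(identifier, sid, fid):
            return False
        # Stand-in for making each recognizer store new feature data:
        # a fresh ID is issued for both the speaker and the face.
        new_sid = new_fid = self.next_id
        self.next_id += 1
        self.associations[identifier] = (new_sid, new_fid)
        return True
```

Calling learn with an unseen identifier and two unrecognized results registers a
new entry, while a later call with the same identifier leaves the stored
association unchanged.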

[0287] Moreover, according to the present invention, a learning method is
provided with: a 1st step of conversing with a human and acquiring the
identifier of a target object from the human through the dialogue, while
detecting each of a plurality of different predetermined features of the target
object and recognizing the target object based on the detection results and the
data of each feature of known objects stored beforehand; a 3rd step of judging
whether the target object is a new object based on the acquired identifier of
the target object, each recognition result based on the respective features of
the target object, and association information, stored beforehand, that
associates the identifier of a known object with each recognition result for
that object; and a 4th step of storing, when the target object is judged to be a
new object, the data of each feature of the target object and the association
information for the target object. The identifiers of new persons, objects, and
the like can thereby be learned naturally through dialogue with ordinary people,
in the way humans usually do, and a learning method that can markedly improve
entertainment value can thus be realized.
[0288] Furthermore, according to the present invention, a robot apparatus is
provided with: dialogue means having a function for conversing with a human and
acquiring the identifier of a target object from the human through the dialogue;
a plurality of recognition means, each of which detects a different
predetermined feature of the target object and recognizes the target object
based on the detection result and the data of the corresponding feature of known
objects stored beforehand; storage means for storing association information
that associates the identifier of a known object with the recognition result of
each recognition means for that object; decision means for judging whether the
target object is a new object based on the identifier of the target object
acquired by the dialogue means, the recognition result of each recognition means
for the target object, and the association information stored in the storage
means; and control means for, when the decision means judges the target object
to be a new object, making each recognition means store the data of the
corresponding feature of the target object and making the storage means store
the association information for the target object. The identifiers of new
persons, objects, and the like can thereby be learned naturally through dialogue
with ordinary people, in the way humans usually do, and a robot apparatus that
can markedly improve entertainment value can thus be realized.
[0289] 



DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] A perspective view showing the external configuration of the robot
according to this embodiment.
[Drawing 2] A perspective view showing the external configuration of the robot
according to this embodiment.
[Drawing 3] A schematic diagram for explaining the external configuration of the
robot according to this embodiment.
[Drawing 4] A schematic diagram for explaining the internal configuration of the
robot according to this embodiment.
[Drawing 5] A schematic diagram for explaining the internal configuration of the
robot according to this embodiment.
[Drawing 6] A block diagram for explaining the processing of the main control
section 40 relating to the identifier learning function.
[Drawing 7] A conceptual diagram for explaining the association between FID,
SID, and identifiers in memory.
[Drawing 8] A flow chart showing the identifier learning procedure.
[Drawing 9] A flow chart showing the identifier learning procedure.
[Drawing 10] A schematic diagram showing an example of dialogue during
identifier learning processing.
[Drawing 11] A schematic diagram showing an example of dialogue during
identifier learning processing.
[Drawing 12] A conceptual diagram for explaining new registration of FID, SID,
and an identifier.
[Drawing 13] A schematic diagram showing an example of dialogue during
identifier learning.
[Drawing 14] A schematic diagram showing an example of dialogue during
identifier learning processing.
[Drawing 15] A block diagram showing the configuration of the speech recognition
section.
[Drawing 16] A conceptual diagram for explaining a word dictionary.
[Drawing 17] A conceptual diagram for explaining the syntax rules.
[Drawing 18] A conceptual diagram for explaining the stored contents of the
feature-vector buffer.
[Drawing 19] A conceptual diagram for explaining a score sheet.
[Drawing 20] A flow chart showing the speech recognition procedure.
[Drawing 21] A flow chart showing the unregistered-word processing procedure.
[Drawing 22] A flow chart showing the cluster division procedure.
[Drawing 23] A conceptual diagram showing a simulation result.
[Drawing 24] A block diagram showing the configuration of the face recognition
section at the time of learning.
[Drawing 25] A block diagram showing the configuration of the face recognition
section at the time of recognition.
[Description of Notations]

1 .. Robot, 40 .. Main control section, 50 .. CCD camera, 51 .. Microphone,
54 .. Loudspeaker, 60 .. Speech recognition section, 61 .. Speaker-recognition
section, 62 .. Face recognition section, 63 .. Dialogue control section,
64 .. Speech synthesis section, 65 .. Memory, S1A .. Picture signal,
S1B, S3 .. Sound signal, D1, D2 .. Character-string data,
RT1 .. Identifier learning procedure.
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